Yield and stress tolerance in transgenic plants iv

ABSTRACT

Polynucleotides and polypeptides incorporated into expression vectors have been introduced into plants and were ectopically expressed. The polypeptides of the invention have been shown to confer at least one regulatory activity and confer increased yield, greater height, greater early season growth, greater canopy coverage, greater stem diameter, greater late season vigor, increased secondary rooting, more rapid germination, greater cold tolerance, greater tolerance to water deprivation, reduced stomatal conductance, altered C/N sensing, increased low nitrogen tolerance, increased low phosphorus tolerance, or increased tolerance to hyperosmotic stress as compared to a control plant.

RELATIONSHIP TO COPENDING APPLICATIONS

This application is a divisional application of prior-filed U.S. patent application Ser. No. 11/821,448, filed Jun. 22, 2007 (pending). U.S. patent application Ser. No. 11/821,448 claims the benefit of U.S. provisional application 60/817,886, filed Jun. 29, 2006 (expired). U.S. patent application Ser. No. 11/821,448 is also a continuation-in-part of prior-filed U.S. patent application Ser. No. 11/642,814, filed Dec. 20, 2006 (pending), which is a divisional application of prior-filed U.S. patent application Ser. No. 10/666,642, filed Sep. 18, 2003, and which issued as U.S. Pat. No. 7,196,245 on Mar. 27, 2007, the latter application claiming the benefit of prior-filed U.S. provisional application 60/411,837, filed Sep. 18, 2002 (expired), U.S. provisional application 60/434,166, filed Dec. 17, 2002 (expired), and U.S. provisional application 60/465,809, filed Apr. 24, 2003 (expired). The entire contents of each of these applications are hereby incorporated by reference.

JOINT RESEARCH AGREEMENT

The claimed invention, in the field of functional genomics and the characterization of plant genes for the improvement of plants, was made by or on behalf of Mendel Biotechnology, Inc. and Monsanto Company as a result of activities undertaken within the scope of a joint research agreement in effect on or before the date the claimed invention was made.

REFERENCE TO A “SEQUENCE LISTING,” A TABLE, OR A COMPUTER PROGRAM LISTING APPENDIX SUBMITTED ON A COMPACT DISK

The Sequence Listing written in file—9-1. APP, 86,016 bytes, created on Jun. 19, 2007 on duplicate copies of compact disc of the written form of the Sequence Listing, i.e., “Copy 1 of 3” and “Copy 2 of 3”, and the sequence information recorded in computer readable form on compact disc, i.e., “Copy 3 of 3” for Application No: 60/17,886, Creelman et al., IMPROVED YIELD AND STRESS TOLERANCE IN TRANSGENIC PLANTS, is hereby incorporated by reference.

FIELD OF THE INVENTION

The present invention relates to plant genomics and plant improvement.

BACKGROUND OF THE INVENTION

The Effects of Various Factors on Plant Yield

Yield of commercially valuable species in the natural environment may be suboptimal as plants often grow under unfavorable conditions, such as at an inappropriate temperature or with a limited supply of soil nutrients, light, or water. For example, nitrogen (N) and phosphorus (P) are critical limiting nutrients for plants. Phosphorus is second only to nitrogen in its importance as a macronutrient for plant growth and in its impact on crop yield. Plants have evolved several strategies to help cope with P and N deprivation that include metabolic as well as developmental adaptations. Most, if not all, of these strategies have components that are regulated at the level of transcription and therefore are amenable to manipulation by transcription factors. Metabolic adaptations include increasing the availability of P and N by increasing uptake from the soil through the induction of high affinity and low affinity transporters, and/or increasing their mobilization in the plant. Developmental adaptations include increases in primary and secondary roots, increases in root hair number and length, and associations with mycorrhizal fungi (Bates and Lynch (1996); Harrison (1999)).

Nitrogen and carbon metabolism are tightly linked in almost every biochemical pathway in the plant. Carbon metabolites regulate genes involved in N acquisition and metabolism, and are known to affect germination and the expression of photosynthetic genes (Coruzzi et al. (2001)) and hence growth. Early studies on nitrate reductase (NR) in 1976 showed that NR activity could be affected by Glc/Suc (Crawford (1995); Daniel-Vedele et al. (1996)). Those observations were supported by later experiments that showed sugars induce NR mRNA in dark-adapted, green seedlings (Cheng et al. (1992)). C and N may have antagonistic relationships as signaling molecules; light induction of NR activity and mRNA levels can be mimicked by C metabolites, and N-metabolites cause repression of NR induction in tobacco (Vincentz et al. (1992)). Gene regulation by C/N (carbon-nitrogen balance) status has been demonstrated for a number of N-metabolic genes (Stitt (1999); Coruzzi et al. (2001)). Thus, a plant with altered C/N sensing may exhibit improved germination and/or growth under nitrogen-limiting conditions.

Water deficit is a major limitation of crop yields. In water-limited environments, crop yield is a function of water use, water use efficiency (WUE; defined as aerial biomass yield/water use) and the harvest index (HI; the ratio of yield biomass to the total cumulative biomass at harvest). WUE is a complex trait that involves water and CO₂ uptake, transport and exchange at the leaf surface (transpiration). Improved WUE has been proposed as a criterion for yield improvement under drought. Water deficit can also have adverse effects in the form of increased susceptibility to disease and pests, reduced plant growth and reproductive failure. Useful genes for expression especially during water deficit are genes which promote aspects of plant growth or fertility, genes which impart disease resistance, genes which impart pest resistance, and the like. These limitations can delay growth and development, reduce productivity, and in extreme cases, cause the plant to die. Enhanced tolerance to these stresses would lead to yield increases in conventional varieties and reduce yield variation in hybrid varieties.
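
As a worked illustration of the two ratios defined in the preceding paragraph (WUE as aerial biomass yield divided by water use, and HI as yield biomass divided by total cumulative biomass at harvest), the following is a minimal sketch. The function names, units, and numeric values are hypothetical and are not data from this disclosure.

```python
def water_use_efficiency(aerial_biomass_yield, water_use):
    """WUE = aerial biomass yield / water use (units are the caller's choice)."""
    if water_use <= 0:
        raise ValueError("water_use must be positive")
    return aerial_biomass_yield / water_use


def harvest_index(yield_biomass, total_biomass):
    """HI = yield biomass / total cumulative biomass at harvest."""
    if total_biomass <= 0:
        raise ValueError("total_biomass must be positive")
    return yield_biomass / total_biomass


# Hypothetical illustrative values only (not measured data):
# 12.0 units of aerial biomass produced with 400.0 units of water,
# of which 4.5 units of biomass are harvested yield.
wue = water_use_efficiency(12.0, 400.0)
hi = harvest_index(4.5, 12.0)
```

Because both quantities are simple ratios, a change in either the numerator or the denominator alters the result, which is one reason WUE is described above as a complex trait.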

Another factor affecting yield is the number of plants that can be grown per acre. For crop species, planting or population density varies from crop to crop, from one growing region to another, and from year to year.

A plant's traits, including its biochemical, developmental, or phenotypic characteristics that enhance yield or tolerance to various abiotic stresses, may be controlled through a number of cellular processes. One important way to manipulate that control is through transcription factors, which are proteins that influence the expression of a particular gene or sets of genes. Transformed and transgenic plants that comprise cells having altered levels of at least one selected transcription factor, for example, possess advantageous or desirable traits. Strategies for manipulating traits by altering a plant cell's transcription factor content can therefore result in plants and crops with commercially valuable properties.

SUMMARY OF THE INVENTION

An object of this invention is to provide plants which can express genes to increase yield of commercially significant plants, as well as to ameliorate the adverse effects of water or nutrient deficit.

The present invention thus pertains to novel recombinant polynucleotides, expression vectors, host plant cells and transgenic plants that contain them, and methods for producing the transgenic plants.

The recombinant polynucleotides may include any of the following sequences:

-   (a) the nucleotide sequences found in the sequence listing;
-   (b) nucleotide sequences encoding polypeptides found in the sequence listing;
-   (c) sequence variants that are at least 30% sequence identical to any of the nucleotide sequences of (a) or (b);
-   (d) polypeptide sequences that are at least 30% identical, or at least 32%, at least 33%, at least 36%, at least 40%, at least 45%, or at least 67% identical in their amino acid sequence to any of SEQ ID NOs: 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, or 24;
-   (e) orthologous and paralogous nucleotide sequences that are at least 40% identical to any of the nucleotide sequences of (a) or (b);
-   (f) nucleotide sequences that hybridize to any of the nucleotide sequences of (a) or (b) under stringent conditions, which may include, for example, hybridization with wash steps of 6×SSC and 65° C. for ten to thirty minutes per step; and
-   (g) polypeptides, and the nucleotide sequences that encode them, having a B-box zinc finger conserved domain required for the function of regulating transcription and altering a trait in a transgenic plant, the conserved domain being at least about 56%, or at least about 58%, or at least about 60%, or at least about 65%, or at least about 67%, or at least about 70%, or at least about 75%, or at least about 76%, or at least about 77%, or at least about 78%, or at least about 79%, or at least about 80%, or at least about 81%, or at least about 82%, or at least about 83%, or at least about 84%, or at least about 85%, or at least about 86%, or at least about 87%, or at least about 88%, or at least about 89%, or at least about 90%, or at least about 91%, or at least about 92%, or at least about 93%, or at least about 94%, or at least about 95%, or at least about 96%, or at least about 97%, or at least about 98%, or at least about 99%, identical in its amino acid residue sequence to the B-box zinc-finger (ZF) conserved domains of SEQ ID NO: 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22 or 24 (i.e., a polypeptide listed in the sequence listing, or encoded by any of the above nucleotide sequences, the conserved domains being represented by SEQ ID NOs: 45-56, respectively).

The conserved domains of the invention listed in Table 1 comprise a domain required for the function of regulating transcription and altering a trait in a transgenic plant, said trait selected from the group consisting of increasing yield, increasing height, altering C/N sensing, increasing low nitrogen tolerance, increasing low phosphorus tolerance, increasing tolerance to water deprivation, reducing stomatal conductance, and increasing tolerance to a hyperosmotic stress, as compared to the control plant. Additionally, the polypeptides of the invention may comprise several signature residues closer to the C-terminus than the B-box domain. These residues comprise, in order from N to C termini:

-   W-X₄-G (SEQ ID NO: 62, where X represents any amino acid; seen in FIG. 4D);
-   R-X₃-A-X₃-W (SEQ ID NO: 57, where X represents any amino acid; seen in FIG. 4D); and
-   EGWXE (SEQ ID NO: 58, where X represents any amino acid; seen in FIG. 4E).
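
The three signature motifs listed above can be expressed as simple patterns and searched for in N-to-C order. The following is a minimal sketch, assuming one-letter amino acid codes in upper case; the function name, the `start` parameter (intended as the position just past the B-box domain), and the toy sequence in the usage line are hypothetical and are not taken from the sequence listing.

```python
import re

# X represents any amino acid; one-letter upper-case codes are assumed.
MOTIFS = [
    ("W-X4-G", re.compile(r"W[A-Z]{4}G")),
    ("R-X3-A-X3-W", re.compile(r"R[A-Z]{3}A[A-Z]{3}W")),
    ("EGWXE", re.compile(r"EGW[A-Z]E")),
]


def find_signature_motifs(protein, start=0):
    """Return [(motif_name, position), ...] if all three motifs occur,
    without overlapping, in N-to-C order at or after `start`;
    return None if any motif is missing."""
    hits = []
    pos = start
    for name, pattern in MOTIFS:
        match = pattern.search(protein, pos)
        if match is None:
            return None
        hits.append((name, match.start()))
        pos = match.end()  # the next motif must lie closer to the C-terminus
    return hits


# Hypothetical toy sequence, for illustration only.
toy = "MKWAAAAGTTRKKKAKKKWSEGWQEK"
toy_hits = find_signature_motifs(toy)
```

A pattern search of this kind only flags candidate sequences; it does not establish that a protein belongs to the clade or has the described activity.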

The expression vectors, and hence the transgenic plants, of the invention, comprise putative transcription factor polynucleotide sequences and, in particular, B-box zinc finger sequences. When any of these polypeptides of the invention is overexpressed in a plant, the polypeptide confers at least one regulatory activity to the plant, which in turn is manifested in a trait selected from the group consisting of increased yield, greater height, increased secondary rooting, greater cold tolerance, greater tolerance to water deprivation, reduced stomatal conductance, altered C/N sensing, increased low nitrogen tolerance, increased low phosphorus tolerance, and increased tolerance to hyperosmotic stress as compared to the control plant.

The invention is also directed to transgenic seed produced by any of the transgenic plants of the invention, and to methods for making the transgenic plants and transgenic seed of the invention.

BRIEF DESCRIPTION OF THE SEQUENCE LISTING AND DRAWINGS

The Sequence Listing provides exemplary polynucleotide and polypeptide sequences of the invention. The traits associated with the use of the sequences are included in the Examples.

Incorporation of the Sequence Listing

The Sequence Listing provides exemplary polynucleotide and polypeptide sequences. The copy of the Sequence Listing, being submitted electronically with this patent application, provided under 37 CFR § 1.821-1.825, is a read-only memory computer-readable file in ASCII text format. The Sequence Listing is named “MBI-0076 DIV_ST25.txt”, the electronic file of the Sequence Listing was created on Mar. 25, 2010, and is 85,416 bytes in size (84 kilobytes in size as measured in MS-WINDOWS). The Sequence Listing is herein incorporated by reference in its entirety.

FIG. 1 shows a conservative estimate of phylogenetic relationships among the orders of flowering plants (modified from Soltis et al. (1997)). Those plants with a single cotyledon (monocots) are a monophyletic clade nested within at least two major lineages of dicots; the eudicots are further divided into rosids and asterids. Arabidopsis is a rosid eudicot classified within the order Brassicales; rice is a member of the monocot order Poales. FIG. 1 was adapted from Daly et al. (2001).

FIG. 2 shows a phylogenetic dendrogram depicting phylogenetic relationships of higher plant taxa, including clades containing tomato and Arabidopsis; adapted from Ku et al. (2000); and Chase et al. (1993).

In FIG. 3, a phylogenetic tree and multiple sequence alignments of G1988 and related full length proteins were constructed using ClustalW (CLUSTAL W Multiple Sequence Alignment Program version 1.83, 2003). ClustalW multiple alignment parameters were:

-   Gap Opening Penalty: 10.00
-   Gap Extension Penalty: 0.20
-   Delay divergent sequences: 30%
-   DNA Transitions Weight: 0.50
-   Protein weight matrix: Gonnet series
-   DNA weight matrix: IUB
-   Use negative matrix: OFF

A FastA formatted alignment was then used to generate a phylogenetic tree in MEGA2 software (http://www.megasoftware.net) using the neighbor joining algorithm and a p-distance model. A test of phylogeny was done via bootstrap with 1000 replications and Random Seed set to default. Cut off values of the bootstrap tree were set to 50%. Closely-related homologs of G1988 are considered as being those proteins within the node of the tree below with a bootstrap value of 74, bounded by G4011 and G4009 (indicated by the box around these sequences). The ancestral sequence is represented by the node of the tree indicated by the arrow in FIG. 3 having a bootstrap value of 74. Abbreviations: At - Arabidopsis thaliana; Ct - Citrus sinensis; Gm - Glycine max; Os - Oryza sativa; Pt - Populus trichocarpa; Zm - Zea mays.
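
For readers unfamiliar with the p-distance model named above, it is the proportion of compared alignment sites at which two sequences differ. The following is a minimal sketch of that calculation and is not the MEGA2 implementation; the function name is hypothetical, and skipping any column in which either sequence has a gap is an assumption made here for illustration.

```python
def p_distance(seq_a, seq_b, gap="-"):
    """Proportion of differing sites between two aligned sequences.

    Assumption for this sketch: columns in which either sequence has a
    gap character are excluded from the comparison.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a, seq_b):
        if a == gap or b == gap:
            continue  # skip gapped columns
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


# Hypothetical aligned fragments: 5 comparable sites, 2 differences.
d = p_distance("MKT-LV", "MRTALI")
```

A neighbor joining tree is built from a matrix of such pairwise distances; the bootstrap test then resamples alignment columns to estimate support for each node.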

FIGS. 4A-4F show a Clustal W alignment of the G1988 clade and related proteins. SEQ ID NOs: appear in parentheses after each Gene IDentifier (GID). Some members of the G1988 clade appear in the large boxes in each of FIGS. 4A-4F. The highly conserved B-box zinc-finger (ZF) conserved domain (B domain) is identified in FIGS. 4A-4B by the horizontal line below the alignment. Several characteristic or signature residues within characteristic motifs outside of and nearer to the C-terminus than the B-domain are indicated by the small dark triangles in FIGS. 4D and 4E.

FIG. 5 shows the average measured leaf SPAD chlorophyll level (SPAD or “Soil Plant Analysis Development”, measured with a Minolta SPAD-502 leaf chlorophyll meter, vertical axis) in G1988 Arabidopsis overexpressor lines (OE lines 10, 12 and 8-2; horizontal axis). Also shown are measurements for control plants (Cntl) for each of the three experimental lines. Plants were grown in 10 hr light, 0.1 mM NH₄NO₃, pre-bolting and were assayed 7.5 weeks after planting. The error bars represent the standard deviation of the mean. The three G1988 lines had higher chlorophyll content under low nitrogen conditions than the controls. Results obtained for lines 10 and 12 were significant at p<0.01.

FIG. 6 compares the effects on yield (vertical axis: change in percentage yield) in various lines (horizontal axis) of transgenic soybean plants overexpressing G1988 (35S::G1988) in year 2004 and 2005 field trials. Data are averaged across multiple locations and a consistent increase in yield, as compared with controls harboring an empty construct, was observed. In the 2005 analysis, G1988 significantly increased yield in 17 of 19 locations. If line 4, which unlike other lines presented in this graph showed little or no expression of G1988 in leaf tissue, is removed from the analysis, the average yield increase in 2005 was about 6.7%.

FIG. 7 shows experimental data obtained in 2005 with seed from a California field trial comparing a wild-type control soybean line and numerous 35S::G1988 overexpressing lines of soybean plants. The dotted curve represents the germination percentage of the wild-type line. The dashed curve above it represents a low overexpressor that ultimately produced a small increase in yield over the control. The darker solid curves above that of the low overexpressor represent other 35S::G1988 overexpressors that showed a higher degree of expression, ultimately produced significantly higher yield, and showed improved germination in cold as compared to the controls. Similar results were obtained with seed derived in the same year from a field trial conducted in Kansas and two field trials in Illinois. These data demonstrated that G1988 overexpression results in improved cold germination of soy.

FIG. 8 compares the overall germination of soybeans from the California field trial. The germination of the control (dotted curve) was poor and it was noted that a high percentage of the seed were “hard seed”, a stress-induced phenomenon that results in seeds that resist imbibition under standard conditions. The dashed curve below the dotted control curve represents the low overexpressor that appeared to have a similar percentage of hard seed, that is, the same percentage of seed that did not germinate at various time points, as the control. The darker solid curves below the control and low overexpressor represent other 35S::G1988 overexpressing lines that had a lower percentage of hard seed and eventually produced a higher yield than controls.

FIG. 9 shows the mean number of pod-containing mainstem nodes, relative to the parental control line represented by the “0” line, observed in various lines of soybean plants overexpressing a number of sequences. The shaded bars denote G1988 overexpressing lines, which generally produced a significantly greater number of pod-bearing nodes than the control plants.

FIG. 10 demonstrates how the increased soybean plant height that is characteristic of G1988 overexpression in short day periods (10 hours light, 14 hours dark) is largely due to an increase in internode length in the upper portion of the plant. The most readily observable differences between a transgenic line and a control line were observed for internodes 8 through 12. The differences in plant height between G1988 transgenic plants and controls were thus accentuated late in the growing season. The control untransformed line used in these experiments is represented by the unshaded bars. The shaded bars show the internode length (in centimeters) of overexpressor line 178.

FIG. 11 shows the results of a plant density field trial. As seen in this figure, soybean plants overexpressing G1988 demonstrated an observable yield increase across a range of plant densities, relative to control plants that did not overexpress G1988 (unfilled circles) or to Line 217 transgenic plants that expressed G1988 to a lower degree (about 40% lower) than high yielding transgenic lines (filled circles). Plant stand count did not have a large contribution to harvestable yield. Overexpressor line 178 plants are represented by unfilled triangles. Overexpressor line 189 plants are represented by filled triangles. Overexpressor line 209 plants are represented by unfilled squares. Overexpressor line 200 plants are represented by filled squares. Overexpressor line 213 plants are represented by asterisks.

FIG. 12 illustrates that the constitutive overexpression of G1988 (SEQ ID NO: 2) in soy plants promotes germination. Transgenic plants overexpressing G1988 that had been shown to increase yield in soy (line 218, unfilled diamonds; and line 178, unfilled triangles) generally demonstrated a percentage germination above that of line 217, which expressed G1988 to a lower degree than high yielding transgenic lines (filled circles), and above that of untransformed control plants (unfilled circles). Seeds in these experiments were germinated in 1.0 μM gibberellic acid.

DETAILED DESCRIPTION

The present invention relates to polynucleotides and polypeptides for modifying phenotypes of plants, particularly those associated with increased abiotic stress tolerance and increased yield with respect to a control plant (for example, a wild-type plant). Throughout this disclosure, various information sources are referred to and/or are specifically incorporated. The information sources include scientific journal articles, patent documents, textbooks, and World Wide Web browser-inactive page addresses. While the reference to these information sources clearly indicates that they can be used by one of skill in the art, each and every one of the information sources cited herein is specifically incorporated in its entirety, whether or not a specific mention of “incorporation by reference” is noted. The contents and teachings of each and every one of the information sources can be relied on and used to make and use embodiments of the invention.

As used herein and in the appended claims, the singular forms “a”, “an”, and “the” include the plural reference unless the context clearly dictates otherwise. Thus, for example, a reference to “a host cell” includes a plurality of such host cells, and a reference to “a stress” is a reference to one or more stresses and equivalents thereof known to those skilled in the art, and so forth.

Definitions

“Polynucleotide” is a nucleic acid molecule comprising a plurality of polymerized nucleotides, e.g., at least about 15 consecutive polymerized nucleotides. A polynucleotide may be a nucleic acid, oligonucleotide, nucleotide, or any fragment thereof. In many instances, a polynucleotide comprises a nucleotide sequence encoding a polypeptide (or protein) or a domain or fragment thereof. Additionally, the polynucleotide may comprise a promoter, an intron, an enhancer region, a polyadenylation site, a translation initiation site, 5′ or 3′ untranslated regions, a reporter gene, a selectable marker, or the like. The polynucleotide can be single-stranded or double-stranded DNA or RNA. The polynucleotide optionally comprises modified bases or a modified backbone. The polynucleotide can be, e.g., genomic DNA or RNA, a transcript (such as an mRNA), a cDNA, a PCR product, a cloned DNA, a synthetic DNA or RNA, or the like. The polynucleotide can be combined with carbohydrate, lipids, protein, or other materials to perform a particular activity such as transformation or form a useful composition such as a peptide nucleic acid (PNA). The polynucleotide can comprise a sequence in either sense or antisense orientations. “Oligonucleotide” is substantially equivalent to the terms amplimer, primer, oligomer, element, target, and probe and is preferably single-stranded.

A “recombinant polynucleotide” is a polynucleotide that is not in its native state, e.g., the polynucleotide comprises a nucleotide sequence not found in nature, or the polynucleotide is in a context other than that in which it is naturally found, e.g., separated from nucleotide sequences with which it typically is in proximity in nature, or adjacent to (or contiguous with) nucleotide sequences with which it typically is not in proximity. For example, the sequence at issue can be cloned into a vector, or otherwise recombined with one or more additional nucleic acids.

An “isolated polynucleotide” is a polynucleotide, whether naturally occurring or recombinant, that is present outside the cell in which it is typically found in nature, whether purified or not. Optionally, an isolated polynucleotide is subject to one or more enrichment or purification procedures, e.g., cell lysis, extraction, centrifugation, precipitation, or the like.

“Gene” or “gene sequence” refers to the partial or complete coding sequence of a gene, its complement, and its 5′ or 3′ untranslated regions. A gene is also a functional unit of inheritance, and in physical terms is a particular segment or sequence of nucleotides along a molecule of DNA (or RNA, in the case of RNA viruses) involved in producing a polypeptide chain. The latter may be subjected to subsequent processing such as chemical modification or folding to obtain a functional protein or polypeptide. A gene may be isolated, partially isolated, or found within an organism's genome. By way of example, a transcription factor gene encodes a transcription factor polypeptide, which may be functional or require processing to function as an initiator of transcription.

Operationally, genes may be defined by the cis-trans test, a genetic test that determines whether two mutations occur in the same gene and that may be used to determine the limits of the genetically active unit (Rieger et al. (1976)). A gene generally includes regions preceding (“leaders”; upstream) and following (“trailers”; downstream) the coding region. A gene may also include intervening, non-coding sequences, referred to as “introns”, located between individual coding segments, referred to as “exons”. Most genes have an associated promoter region, a regulatory sequence 5′ of the transcription initiation codon (there are some genes that do not have an identifiable promoter). The function of a gene may also be regulated by enhancers, operators, and other regulatory elements.

A “polypeptide” is an amino acid sequence comprising a plurality of consecutive polymerized amino acid residues, e.g., at least about 15 consecutive polymerized amino acid residues. In many instances, a polypeptide comprises a polymerized amino acid residue sequence that is a transcription factor or a domain or portion or fragment thereof. Additionally, the polypeptide may comprise: (i) a localization domain; (ii) an activation domain; (iii) a repression domain; (iv) an oligomerization domain; (v) a protein-protein interaction domain; (vi) a DNA-binding domain; or the like. The polypeptide optionally comprises modified amino acid residues, naturally occurring amino acid residues not encoded by a codon, or non-naturally occurring amino acid residues.

“Protein” refers to an amino acid sequence, oligopeptide, peptide, polypeptide or portions thereof whether naturally occurring or synthetic.

“Portion”, as used herein, refers to any part of a protein used for any purpose, but especially for the screening of a library of molecules which specifically bind to that portion or for the production of antibodies.

A “recombinant polypeptide” is a polypeptide produced by translation of a recombinant polynucleotide. A “synthetic polypeptide” is a polypeptide created by consecutive polymerization of isolated amino acid residues using methods well known in the art. An “isolated polypeptide,” whether a naturally occurring or a recombinant polypeptide, is more enriched in (or out of) a cell than the polypeptide in its natural state in a wild-type cell, e.g., more than about 5% enriched, more than about 10% enriched, or more than about 20%, or more than about 50%, or more, enriched, i.e., alternatively denoted: 105%, 110%, 120%, 150% or more, enriched relative to wild type standardized at 100%. Such an enrichment is not the result of a natural response of a wild-type plant. Alternatively, or additionally, the isolated polypeptide is separated from other cellular components with which it is typically associated, e.g., by any of the various protein purification methods herein.

“Homology” refers to sequence similarity between a reference sequence and at least a fragment of a newly sequenced clone insert or its encoded amino acid sequence.

“Identity” or “similarity” refers to sequence similarity between two polynucleotide sequences or between two polypeptide sequences, with identity being a more strict comparison. The phrases “percent identity” and “% identity” refer to the percentage of sequence similarity found in a comparison of two or more polynucleotide sequences or two or more polypeptide sequences. “Sequence similarity” refers to the percent similarity in base pair sequence (as determined by any suitable method) between two or more polynucleotide sequences. Two or more sequences can be anywhere from 0-100% similar, or any integer value therebetween. Identity or similarity can be determined by comparing a position in each sequence that may be aligned for purposes of comparison. When a position in the compared sequence is occupied by the same nucleotide base or amino acid, then the molecules are identical at that position. A degree of similarity or identity between polynucleotide sequences is a function of the number of identical, matching or corresponding nucleotides at positions shared by the polynucleotide sequences. A degree of identity of polypeptide sequences is a function of the number of identical amino acids at corresponding positions shared by the polypeptide sequences. A degree of homology or similarity of polypeptide sequences is a function of the number of amino acids at corresponding positions shared by the polypeptide sequences.
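
The position-by-position comparison described in this definition can be sketched as follows. This is one simple way to compute percent identity from a pairwise alignment and is not the specific method used to generate the values in this disclosure; the function name is hypothetical, and counting only columns in which both sequences have a residue is an assumption of this sketch (other conventions divide by the full alignment length or by the shorter sequence length).

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of shared (non-gap) alignment positions that are identical."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Keep only positions occupied by a residue in both sequences.
    shared = [(a, b) for a, b in zip(aligned_a, aligned_b)
              if a != gap and b != gap]
    if not shared:
        raise ValueError("sequences share no aligned positions")
    identical = sum(1 for a, b in shared if a == b)
    return 100.0 * identical / len(shared)


# Hypothetical aligned fragments: 8 shared positions, 6 identical.
pid = percent_identity("CDACK-SAPA", "CDVCKQSA-T")
```

Because the result depends on the alignment and on the choice of denominator, identity thresholds are only comparable when computed under the same convention.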

“Alignment” refers to a number of nucleotide bases or amino acid residue sequences aligned by lengthwise comparison so that components in common (i.e., nucleotide bases or amino acid residues at corresponding positions) may be visually and readily identified. The fraction or percentage of components in common is related to the homology or identity between the sequences. Alignments such as those of FIGS. 4A-4F may be used to identify conserved domains and relatedness within these domains. An alignment may suitably be determined by means of computer programs known in the art, such as MACVECTOR software (1999) (Accelrys, Inc., San Diego, Calif.).

A “conserved domain” or “conserved region” as used herein refers to a region in heterologous polynucleotide or polypeptide sequences where there is a relatively high degree of sequence identity between the distinct sequences. A “B-box zinc finger domain”, such as is found in a polypeptide member of the B-box zinc finger family, is an example of a conserved domain. With respect to polynucleotides encoding presently disclosed polypeptides, a conserved domain is preferably at least nine base pairs (bp) in length. A conserved domain with respect to presently disclosed polypeptides refers to a domain within a polypeptide family that exhibits a higher degree of sequence homology, such as at least about 56% sequence identity, or at least about 58% sequence identity, or at least about 60% sequence identity, or at least about 65%, or at least about 67%, or at least about 70%, or at least about 75%, or at least about 76%, or at least about 77%, or at least about 78%, or at least about 79%, or at least about 80%, or at least about 81%, or at least about 82%, or at least about 83%, or at least about 84%, or at least about 85%, or at least about 86%, or at least about 87%, or at least about 88%, or at least about 89%, or at least about 90%, or at least about 91%, or at least about 92%, or at least about 93%, or at least about 94%, or at least about 95%, or at least about 96%, or at least about 97%, or at least about 98%, or at least about 99%, amino acid residue sequence identity, to a conserved domain of a polypeptide of the invention (e.g., any of SEQ ID NOs: 45-56). Sequences that possess or encode for conserved domains that meet these criteria of percentage identity, and that have comparable biological activity to the present polypeptide sequences, thus being members of the G1988 clade polypeptides, are encompassed by the invention. A fragment or domain can be referred to as outside a conserved domain, outside a consensus sequence, or outside a consensus DNA-binding site that is known to exist or that exists for a particular polypeptide class, family, or sub-family. In this case, the fragment or domain will not include the exact amino acids of a consensus sequence or consensus DNA-binding site of a transcription factor class, family or sub-family, or the exact amino acids of a particular transcription factor consensus sequence or consensus DNA-binding site. Furthermore, a particular fragment, region, or domain of a polypeptide, or a polynucleotide encoding a polypeptide, can be “outside a conserved domain” if all the amino acids of the fragment, region, or domain fall outside of a defined conserved domain(s) for a polypeptide or protein. Sequences having lesser degrees of identity but comparable biological activity are considered to be equivalents.
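By way of illustration only, the percentage-identity criterion for a conserved domain can be sketched as follows. This is a minimal sketch that assumes the two domain sequences have already been aligned by an alignment program and that identity is counted over all aligned columns, which is one of several conventions in use; the function names and the gap character are hypothetical and are not part of the disclosure.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percentage of aligned columns that hold the same residue.

    Assumption: both inputs come from the same pairwise alignment, so they
    have equal length. Columns that are gaps in both sequences are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == gap and b == gap)
    ]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b and a != gap)
    return 100.0 * matches / len(columns)


def meets_identity_threshold(aligned_a: str, aligned_b: str,
                             threshold: float = 56.0) -> bool:
    """True if the aligned domains share at least `threshold` percent identity."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

The default threshold of 56% corresponds to the lowest identity value recited above; any of the higher recited values may be passed instead.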

As one of ordinary skill in the art recognizes, conserved domains may be identified as regions or domains of identity to a specific consensus sequence (see, for example, Riechmann et al. (2000a, 2000b)). Thus, by using alignment methods well known in the art, the conserved domains of the plant polypeptides, for example, for the B-box zinc finger proteins (Putterill et al. (1995)), may be determined.

The conserved domains for many of the polypeptide sequences of the invention are listed in Table 1. Also, the polypeptides of Table 1 have conserved domains specifically indicated by amino acid coordinate start and stop sites. A comparison of the regions of these polypeptides allows one of skill in the art (see, for example, Reeves and Nissen (1990, 1995)) to identify domains or conserved domains for any of the polypeptides listed or referred to in this disclosure.

“Complementary” refers to the natural hydrogen bonding by base pairing between purines and pyrimidines. For example, the sequence A-C-G-T (5′->3′) forms hydrogen bonds with its complements A-C-G-T (5′->3′) or A-C-G-U (5′->3′). Two single-stranded molecules may be considered partially complementary, if only some of the nucleotides bond, or “completely complementary” if all of the nucleotides bond. The degree of complementarity between nucleic acid strands affects the efficiency and strength of hybridization and amplification reactions. “Fully complementary” refers to the case where bonding occurs between every base pair and its complement in a pair of sequences, and the two sequences have the same number of nucleotides.
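The A-C-G-T example can be checked with a short sketch, offered for illustration only. It assumes Watson-Crick pairing between antiparallel strands of equal length, both written 5′->3′, and treats U as equivalent to T for pairing purposes; the function names are hypothetical.

```python
# Watson-Crick partners for DNA bases; U in an RNA strand is treated as T.
DNA_COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a strand, written 5'->3' as DNA."""
    dna = seq.upper().replace("U", "T")
    return "".join(DNA_COMPLEMENT[base] for base in reversed(dna))


def fraction_paired(strand_a: str, strand_b: str) -> float:
    """Fraction of positions at which two antiparallel strands base-pair.

    Both strands are given 5'->3' and must have the same length. A value of
    1.0 corresponds to "fully complementary"; a value between 0 and 1
    corresponds to "partially complementary".
    """
    if len(strand_a) != len(strand_b):
        raise ValueError("strands must have the same number of nucleotides")
    a = strand_a.upper().replace("U", "T")
    partner = reverse_complement(strand_b)  # what strand_a must equal to pair fully
    return sum(1 for x, y in zip(a, partner) if x == y) / len(a)
```

Because the reverse complement of A-C-G-T is again A-C-G-T, `fraction_paired("ACGT", "ACGT")` and `fraction_paired("ACGT", "ACGU")` both evaluate to 1.0, consistent with the example in the definition.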

The terms “highly stringent” or “highly stringent condition” refer to conditions that permit hybridization of DNA strands whose sequences are highly complementary, wherein these same conditions exclude hybridization of significantly mismatched DNAs. Polynucleotide sequences capable of hybridizing under stringent conditions with the polynucleotides of the present invention may be, for example, variants of the disclosed polynucleotide sequences, including allelic or splice variants, or sequences that encode orthologs or paralogs of presently disclosed polypeptides. Nucleic acid hybridization methods are disclosed in detail by Kashima et al. (1985), Sambrook et al. (1989), and by Haymes et al. (1985), which references are incorporated herein by reference.

In general, stringency is determined by the temperature, ionic strength, and concentration of denaturing agents (e.g., formamide) used in a hybridization and washing procedure (for a more detailed description of establishing and determining stringency, see the section “Identifying Polynucleotides or Nucleic Acids by Hybridization”, below). The degree to which two nucleic acids hybridize under various conditions of stringency is correlated with the extent of their similarity. Thus, similar nucleic acid sequences from a variety of sources, such as within a plant's genome (as in the case of paralogs) or from another plant (as in the case of orthologs) that may perform similar functions can be isolated on the basis of their ability to hybridize with known related polynucleotide sequences. Numerous variations are possible in the conditions and means by which nucleic acid hybridization can be performed to isolate related polynucleotide sequences having similarity to sequences known in the art and are not limited to those explicitly disclosed herein. Such an approach may be used to isolate polynucleotide sequences having various degrees of similarity with disclosed polynucleotide sequences, such as, for example, encoded transcription factors having 56% or greater identity with the conserved domains of disclosed sequences.

The terms “paralog” and “ortholog” are defined below in the section entitled “Orthologs and Paralogs”. In brief, orthologs and paralogs are evolutionarily related genes that have similar sequences and functions. Orthologs are structurally related genes in different species that are derived by a speciation event. Paralogs are structurally related genes within a single species that are derived by a duplication event.

The term “equivalog” describes members of a set of homologous proteins that are conserved with respect to function since their last common ancestor. Related proteins are grouped into equivalog families, and otherwise into protein families with other hierarchically defined homology types. This definition is provided at the Institute for Genomic Research (TIGR) World Wide Web (www) website, “tigr.org”, under the heading “Terms associated with TIGRFAMs”.

In general, the term “variant” refers to molecules with some differences, generated synthetically or naturally, in their base or amino acid sequences as compared to a reference (native) polynucleotide or polypeptide, respectively. These differences include substitutions, insertions, deletions or any desired combinations of such changes in a native polynucleotide or amino acid sequence.

With regard to polynucleotide variants, differences between presently disclosed polynucleotides and polynucleotide variants are limited so that the nucleotide sequences of the former and the latter are closely similar overall and, in many regions, identical. Due to the degeneracy of the genetic code, differences between the former and latter nucleotide sequences may be silent (i.e., the amino acids encoded by the polynucleotide are the same, and the variant polynucleotide sequence encodes the same amino acid sequence as the presently disclosed polynucleotide). Variant nucleotide sequences may encode different amino acid sequences, in which case such nucleotide differences will result in amino acid substitutions, additions, deletions, insertions, truncations or fusions with respect to the similar disclosed polynucleotide sequences. These variations may result in polynucleotide variants encoding polypeptides that share at least one functional characteristic. The degeneracy of the genetic code also dictates that many different variant polynucleotides can encode identical and/or substantially similar polypeptides in addition to those sequences illustrated in the Sequence Listing.
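A silent difference of the kind described above can be illustrated with the following sketch. It assumes the standard genetic code and in-frame coding sequences written as DNA; the function names and the short test sequences are hypothetical and are not taken from the Sequence Listing.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order at each position.
_BASES = "TCAG"
_AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(_BASES, repeat=3), _AMINO_ACIDS)
}


def translate(coding_dna: str) -> str:
    """Translate an in-frame coding sequence; '*' marks a stop codon."""
    dna = coding_dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def is_silent_variant(reference_dna: str, variant_dna: str) -> bool:
    """True if the nucleotide sequences differ but encode the same residues."""
    same_nucleotides = reference_dna.upper() == variant_dna.upper()
    return (not same_nucleotides) and translate(reference_dna) == translate(variant_dna)
```

For instance, the hypothetical sequences ATGGCTTAA and ATGGCCTAA differ at one position but both encode methionine followed by alanine, so the difference is silent, whereas ATGGATTAA encodes aspartic acid at the second position and is not.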

Also within the scope of the invention is a variant of a nucleic acid listed in the Sequence Listing, that is, one having a sequence that differs from one of the polynucleotide sequences in the Sequence Listing, or a complementary sequence, that encodes a functionally equivalent polypeptide (i.e., a polypeptide having some degree of equivalent or similar biological activity) but differs in sequence from the sequence in the Sequence Listing, due to degeneracy in the genetic code. Included within this definition are polymorphisms that may or may not be readily detectable using a particular oligonucleotide probe of the polynucleotide encoding polypeptide, and improper or unexpected hybridization to allelic variants, with a locus other than the normal chromosomal locus for the polynucleotide sequence encoding polypeptide.

“Allelic variant” or “polynucleotide allelic variant” refers to any of two or more alternative forms of a gene occupying the same chromosomal locus. Allelic variation arises naturally through mutation, and may result in phenotypic polymorphism within populations. Gene mutations may be “silent” or may encode polypeptides having altered amino acid sequence. “Allelic variant” and “polypeptide allelic variant” may also be used with respect to polypeptides, and in this case the terms refer to a polypeptide encoded by an allelic variant of a gene.

“Splice variant” or “polynucleotide splice variant” as used herein refers to alternative forms of RNA transcribed from a gene. Splice variation naturally occurs as a result of alternative sites being spliced within a single transcribed RNA molecule or between separately transcribed RNA molecules, and may result in several different forms of mRNA transcribed from the same gene. Thus, splice variants may encode polypeptides having different amino acid sequences, which may or may not have similar functions in the organism. “Splice variant” or “polypeptide splice variant” may also refer to a polypeptide encoded by a splice variant of a transcribed mRNA.

As used herein, “polynucleotide variants” may also refer to polynucleotide sequences that encode paralogs and orthologs of the presently disclosed polypeptide sequences. “Polypeptide variants” may refer to polypeptide sequences that are paralogs and orthologs of the presently disclosed polypeptide sequences.

Differences between presently disclosed polypeptides and polypeptide variants are limited so that the sequences of the former and the latter are closely similar overall and, in many regions, identical. Presently disclosed polypeptide sequences and similar polypeptide variants may differ in amino acid sequence by one or more substitutions, additions, deletions, fusions and truncations, which may be present in any combination. These differences may produce silent changes and result in functionally equivalent polypeptides. Thus, it will be readily appreciated by those of skill in the art, that any of a variety of polynucleotide sequences is capable of encoding the polypeptides and homolog polypeptides of the invention. A polypeptide sequence variant may have “conservative” changes, wherein a substituted amino acid has similar structural or chemical properties. Deliberate amino acid substitutions may thus be made on the basis of similarity in polarity, charge, solubility, hydrophobicity, hydrophilicity, and/or the amphipathic nature of the residues, as long as a significant amount of the functional or biological activity of the polypeptide is retained. For example, negatively charged amino acids may include aspartic acid and glutamic acid, positively charged amino acids may include lysine and arginine, and amino acids with uncharged polar head groups having similar hydrophilicity values may include leucine, isoleucine, and valine; glycine and alanine; asparagine and glutamine; serine and threonine; and phenylalanine and tyrosine. More rarely, a variant may have “non-conservative” changes, e.g., replacement of a glycine with a tryptophan. Similar minor variations may also include amino acid deletions or insertions, or both. Related polypeptides may comprise, for example, additions and/or deletions of one or more N-linked or O-linked glycosylation sites, or an addition and/or a deletion of one or more cysteine residues. Guidance in determining which and how many amino acid residues may be substituted, inserted or deleted without abolishing functional or biological activity may be found using computer programs well known in the art, for example, DNASTAR software (see U.S. Pat. No. 5,840,544).
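The exemplary groupings recited above can be encoded as a simple lookup, shown here as an illustrative sketch only. The groups are limited to those named in the preceding paragraph, using one-letter amino acid codes; the function name is hypothetical, and a substitution outside these groups is merely "not listed as conservative" rather than necessarily deleterious.

```python
# Groups of residues named in the text as having similar properties.
CONSERVATIVE_GROUPS = [
    {"D", "E"},       # aspartic acid, glutamic acid (negatively charged)
    {"K", "R"},       # lysine, arginine (positively charged)
    {"L", "I", "V"},  # leucine, isoleucine, valine
    {"G", "A"},       # glycine, alanine
    {"N", "Q"},       # asparagine, glutamine
    {"S", "T"},       # serine, threonine
    {"F", "Y"},       # phenylalanine, tyrosine
]


def is_conservative(original: str, substituted: str) -> bool:
    """True if both residues fall within one of the groups listed above."""
    a, b = original.upper(), substituted.upper()
    if a == b:
        return True  # no change at all
    return any(a in group and b in group for group in CONSERVATIVE_GROUPS)
```

Under this sketch, replacing aspartic acid with glutamic acid is reported as conservative, while replacing glycine with tryptophan, the non-conservative example given in the text, is not.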

“Fragment”, with respect to a polynucleotide, refers to a clone or any part of a polynucleotide molecule that retains a usable, functional characteristic. Useful fragments include oligonucleotides and polynucleotides that may be used in hybridization or amplification technologies or in the regulation of replication, transcription or translation. A “polynucleotide fragment” refers to any subsequence of a polynucleotide, typically, of at least about 9 consecutive nucleotides, preferably at least about 30 nucleotides, more preferably at least about 50 nucleotides, of any of the sequences provided herein. Exemplary polynucleotide fragments are the first sixty consecutive nucleotides of the polynucleotides listed in the Sequence Listing. Exemplary fragments also include fragments that comprise a region that encodes a conserved domain of a polypeptide. Exemplary fragments also include fragments that comprise a conserved domain of a polypeptide. Exemplary fragments include fragments that comprise a conserved domain of a polypeptide, for example, amino acid residues 5-50 of G1988 (SEQ ID NO: 2), amino acid residues 6-51 of G4004 (SEQ ID NO: 4) or amino acid residues 6-51 of G4005 (SEQ ID NO: 6).

Fragments may also include subsequences of polypeptides and protein molecules, or a subsequence of the polypeptide. Fragments may have uses in that they may have antigenic potential. In some cases, the fragment or domain is a subsequence of the polypeptide which performs at least one biological function of the intact polypeptide in substantially the same manner, or to a similar extent, as does the intact polypeptide. For example, a polypeptide fragment can comprise a recognizable structural motif or functional domain such as a DNA-binding site or domain that binds to a DNA promoter region, an activation domain, or a domain for protein-protein interactions, and may initiate transcription. Fragments can vary in size from as few as 3 amino acid residues to the full length of the intact polypeptide, but are preferably at least about 30 amino acid residues in length and more preferably at least about 60 amino acid residues in length.

The invention also encompasses production of DNA sequences that encode polypeptides and derivatives, or fragments thereof, entirely by synthetic chemistry. After production, the synthetic sequence may be inserted into any of the many available expression vectors and cell systems using reagents well known in the art. Moreover, synthetic chemistry may be used to introduce mutations into a sequence encoding polypeptides or any fragment thereof.

“Derivative” refers to the chemical modification of a nucleic acid molecule or amino acid sequence. Chemical modifications can include replacement of hydrogen by an alkyl, acyl, or amino group or glycosylation, pegylation, or any similar process that retains or enhances biological activity or lifespan of the molecule or sequence.

The term “plant” includes whole plants, shoot vegetative organs/structures (for example, leaves, stems and tubers), roots, flowers and floral organs/structures (for example, bracts, sepals, petals, stamens, carpels, anthers and ovules), seed (including embryo, endosperm, and seed coat) and fruit (the mature ovary), plant tissue (for example, vascular tissue, ground tissue, and the like) and cells (for example, guard cells, egg cells, and the like), and progeny of same. The class of plants that can be used in the method of the invention is generally as broad as the class of higher and lower plants amenable to transformation techniques, including angiosperms (monocotyledonous and dicotyledonous plants), gymnosperms, ferns, horsetails, psilophytes, lycophytes, bryophytes, and multicellular algae (see for example, FIG. 1, adapted from Daly et al. (2001), FIG. 2, adapted from Ku et al. (2000); and see also Tudge (2000)).

A “control plant” as used in the present invention refers to a plant cell, seed, plant component, plant tissue, plant organ or whole plant used to compare against a transgenic or genetically modified plant for the purpose of identifying an enhanced phenotype in the transgenic or genetically modified plant. A control plant may in some cases be a transgenic plant line that comprises an empty vector or marker gene, but does not contain the recombinant polynucleotide of the present invention that is expressed in the transgenic or genetically modified plant being evaluated. In general, a control plant is a plant of the same line or variety as the transgenic or genetically modified plant being tested. A suitable control plant would include a genetically unaltered or non-transgenic plant of the parental line used to generate a transgenic plant herein.

A “transgenic plant” refers to a plant that contains genetic material not found in a wild-type plant of the same species, variety or cultivar. The genetic material may include a transgene, an insertional mutagenesis event (such as by transposon or T-DNA insertional mutagenesis), an activation tagging sequence, a mutated sequence, a homologous recombination event or a sequence modified by chimeraplasty. Typically, the foreign genetic material has been introduced into the plant by human manipulation, but any method can be used as one of skill in the art recognizes.

A transgenic plant may contain an expression vector or cassette. The expression cassette typically comprises a polypeptide-encoding sequence operably linked (i.e., under regulatory control of) to appropriate inducible or constitutive regulatory sequences that allow for the controlled expression of polypeptide. The expression cassette can be introduced into a plant by transformation or by breeding after transformation of a parent plant. A plant refers to a whole plant as well as to a plant part, such as seed, fruit, leaf, or root, plant tissue, plant cells or any other plant material, e.g., a plant explant, as well as to progeny thereof, and to in vitro systems that mimic biochemical or cellular components or processes in a cell.

“Wild type” or “wild-type”, as used herein, refers to a plant cell, seed, plant component, plant tissue, plant organ or whole plant that has not been genetically modified or treated in an experimental sense. Wild-type cells, seed, components, tissue, organs or whole plants may be used as controls to compare levels of expression and the extent and nature of trait modification with cells, tissue or plants of the same species in which a polypeptide's expression is altered, e.g., in that it has been knocked out, overexpressed, or ectopically expressed.

A “trait” refers to a physiological, morphological, biochemical, or physical characteristic of a plant or particular plant material or cell. In some instances, this characteristic is visible to the human eye, such as seed or plant size, or can be measured by biochemical techniques, such as detecting the protein, starch, or oil content of seed or leaves, or by observation of a metabolic or physiological process, e.g., by measuring tolerance to water deprivation or particular salt or sugar concentrations, or by the observation of the expression level of a gene or genes, e.g., by employing Northern analysis, RT-PCR, microarray gene expression assays, or reporter gene expression systems, or by agricultural observations such as hyperosmotic stress tolerance or yield. Any technique can be used to measure the amount of, comparative level of, or difference in any selected chemical compound or macromolecule in the transgenic plants, however.

“Trait modification” refers to a detectable difference in a characteristic in a plant ectopically expressing a polynucleotide or polypeptide of the present invention relative to a plant not doing so, such as a wild-type plant. In some cases, the trait modification can be evaluated quantitatively. For example, the trait modification can entail at least about a 2% increase or decrease, or an even greater difference, in an observed trait as compared with a control or wild-type plant. It is known that there can be a natural variation in the modified trait. Therefore, the trait modification observed entails a change of the normal distribution and magnitude of the trait in the plants as compared to control or wild-type plants.
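As an illustration of the quantitative evaluation described above, the following sketch compares the mean of a measured trait in transgenic plants with the mean in control plants. Comparing means is an assumption made here for simplicity; because of natural variation in the trait, a fuller analysis would also compare the distributions statistically. The function names are hypothetical.

```python
from statistics import mean


def percent_change(control_values, transgenic_values) -> float:
    """Percent difference of the transgenic mean relative to the control mean."""
    control_mean = mean(control_values)
    if control_mean == 0:
        raise ValueError("control mean is zero; percent change is undefined")
    return 100.0 * (mean(transgenic_values) - control_mean) / control_mean


def is_trait_modified(control_values, transgenic_values,
                      min_percent: float = 2.0) -> bool:
    """True if the mean trait value rose or fell by at least min_percent."""
    return abs(percent_change(control_values, transgenic_values)) >= min_percent
```

The default of 2% corresponds to the "at least about a 2% increase or decrease" recited above, and the absolute value is taken so that both increases and decreases qualify.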

When two or more plants have “similar morphologies”, “substantially similar morphologies”, “a morphology that is substantially similar”, or are “morphologically similar”, the plants have comparable forms or appearances, including analogous features such as overall dimensions, height, width, mass, root mass, shape, glossiness, color, stem diameter, leaf size, leaf dimension, leaf density, internode distance, branching, root branching, number and form of inflorescences, and other macroscopic characteristics, and the individual plants are not readily distinguishable based on morphological characteristics alone.

“Modulates” refers to a change in activity (biological, chemical, or immunological) or lifespan resulting from specific binding between a molecule and either a nucleic acid molecule or a protein.

The term “transcript profile” refers to the expression levels of a set of genes in a cell in a particular state, particularly by comparison with the expression levels of that same set of genes in a cell of the same type in a reference state. For example, the transcript profile of a particular polypeptide in a suspension cell is the expression levels of a set of genes in a cell knocking out or overexpressing that polypeptide compared with the expression levels of that same set of genes in a suspension cell that has normal levels of that polypeptide. The transcript profile can be presented as a list of those genes whose expression level is significantly different between the two treatments, and the difference ratios. Differences and similarities between expression levels may also be evaluated and calculated using statistical and clustering methods.
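One way of presenting a transcript profile as a list of genes and difference ratios is sketched below for illustration. The sketch assumes positive expression levels measured on a common scale in both states and uses a simple fold-change cutoff as a stand-in for a test of significance, which in practice would rely on replicate measurements and statistical methods; the function name, the cutoff, and the gene identifiers in the usage example are hypothetical.

```python
def transcript_profile(reference_levels: dict, test_levels: dict,
                       min_fold: float = 2.0) -> dict:
    """Map each differing gene to its test/reference expression ratio.

    A gene is reported only if it was measured in both states and its ratio
    is at least min_fold (up) or at most 1/min_fold (down).
    """
    profile = {}
    for gene in sorted(reference_levels.keys() & test_levels.keys()):
        ref_level = reference_levels[gene]
        test_level = test_levels[gene]
        if ref_level <= 0 or test_level <= 0:
            continue  # ratio is undefined or uninformative for this gene
        ratio = test_level / ref_level
        if ratio >= min_fold or ratio <= 1.0 / min_fold:
            profile[gene] = ratio
    return profile


# Hypothetical expression levels in a reference cell and an overexpressing cell.
reference = {"geneA": 10.0, "geneB": 10.0, "geneC": 8.0}
overexpressor = {"geneA": 40.0, "geneB": 11.0, "geneC": 2.0}
example_profile = transcript_profile(reference, overexpressor)
```

With these hypothetical values, geneA is reported with a ratio of 4.0 and geneC with a ratio of 0.25, while geneB, which changes by only 10%, is omitted.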

With regard to gene knockouts as used herein, the term “knockout” refers to a plant or plant cell having a disruption in at least one gene in the plant or cell, where the disruption results in a reduced expression or activity of the polypeptide encoded by that gene compared to a control cell. The knockout can be the result of, for example, genomic disruptions, including transposons, tilling, and homologous recombination, antisense constructs, sense constructs, RNA silencing constructs, or RNA interference. A T-DNA insertion within a gene is an example of a genotypic alteration that may abolish expression of that gene.

“Ectopic expression or altered expression” in reference to a polynucleotide indicates that the pattern of expression in, e.g., a transgenic plant or plant tissue, is different from the expression pattern in a wild-type plant or a reference plant of the same species. The pattern of expression may also be compared with a reference expression pattern in a wild-type plant of the same species. For example, the polynucleotide or polypeptide is expressed in a cell or tissue type other than a cell or tissue type in which the sequence is expressed in the wild-type plant, or by expression at a time other than at the time the sequence is expressed in the wild-type plant, or by a response to different inducible agents, such as hormones or environmental signals, or at different expression levels (either higher or lower) compared with those found in a wild-type plant. The term also refers to altered expression patterns that are produced by lowering the levels of expression to below the detection level or completely abolishing expression. The resulting expression pattern can be transient or stable, constitutive or inducible. In reference to a polypeptide, the term “ectopic expression or altered expression” further may relate to altered activity levels resulting from the interactions of the polypeptides with exogenous or endogenous modulators or from interactions with factors or as a result of the chemical modification of the polypeptides.

The term “overexpression” as used herein refers to a greater expression level of a gene in a plant, plant cell or plant tissue, compared to expression in a wild-type plant, cell or tissue, at any developmental or temporal stage for the gene. Overexpression can occur when, for example, the genes encoding one or more polypeptides are under the control of a strong promoter (e.g., the cauliflower mosaic virus 35S transcription initiation region). Overexpression may also be under the control of an inducible or tissue specific promoter. Thus, overexpression may occur throughout a plant, in specific tissues of the plant, or in the presence or absence of particular environmental signals, depending on the promoter used.

Overexpression may take place in plant cells normally lacking expression of polypeptides functionally equivalent or identical to the present polypeptides. Overexpression may also occur in plant cells where endogenous expression of the present polypeptides or functionally equivalent molecules normally occurs, but such normal expression is at a lower level. Overexpression thus results in a greater than normal production, or “overproduction” of the polypeptide in the plant, cell or tissue.

The term “transcription regulating region” refers to a DNA regulatory sequence that regulates expression of one or more genes in a plant when a transcription factor having one or more specific binding domains binds to the DNA regulatory sequence. Transcription factors possess a conserved domain. The transcription factors also comprise an amino acid subsequence that forms a transcription activation domain that regulates expression of one or more abiotic stress tolerance genes in a plant when the transcription factor binds to the regulating region.

“Yield” or “plant yield” refers to increased plant growth, increased crop growth, increased biomass, and/or increased plant product production, and is dependent to some extent on temperature, plant size, organ size, planting density, light, water and nutrient availability, and how the plant copes with various stresses, such as through temperature acclimation and water or nutrient use efficiency.

“Planting density” refers to the number of plants that can be grown per acre. For crop species, planting or population density varies from crop to crop, from one growing region to another, and from year to year. Using corn as an example, the average prevailing density in 2000 was in the range of 20,000-25,000 plants per acre in Missouri, USA. A desirable higher population density (a measure of yield) would be at least 22,000 plants per acre, and a more desirable higher population density would be at least 28,000 plants per acre, more preferably at least 34,000 plants per acre, and most preferably at least 40,000 plants per acre. The average prevailing densities per acre of a few other examples of crop plants in the USA in the year 2000 were: wheat 1,000,000-1,500,000; rice 650,000-900,000; soybean 150,000-200,000; canola 260,000-350,000; sunflower 17,000-23,000; and cotton 28,000-55,000 plants per acre (Cheikh et al. (2003) U.S. Patent Application No. 20030101479). A desirable higher population density for each of these examples, as well as other valuable species of plants, would be at least 10% higher than the average prevailing density or yield.

Description of the Specific Embodiments

Transcription Factors Modify Expression of Endogenous Genes

A transcription factor may include, but is not limited to, any polypeptide that can activate or repress transcription of a single gene or a number of genes. As one of ordinary skill in the art recognizes, transcription factors can be identified by the presence of a region or domain of structural similarity or identity to a specific consensus sequence or the presence of a specific consensus DNA-binding motif (see, for example, Riechmann et al. (2000a)). The plant transcription factors of the present invention belong to the B-box zinc finger family (Putterill et al. (1995)) and are putative transcription factors.

Generally, transcription factors are involved in cell differentiation and proliferation and the regulation of growth. Accordingly, one skilled in the art would recognize that by expressing the present sequences in a plant, one may change the expression of autologous genes or induce the expression of introduced genes. By affecting the expression of similar autologous sequences in a plant that have the biological activity of the present sequences, or by introducing the present sequences into a plant, one may alter a plant's phenotype to one with improved traits related to osmotic stresses. The sequences of the invention may also be used to transform a plant and introduce desirable traits not found in the wild-type cultivar or strain. Plants may then be selected for those that produce the most desirable degree of over- or under-expression of target genes of interest and coincident trait improvement.

The sequences of the present invention may be from any species, particularly plant species, in a naturally occurring form or from any source whether natural, synthetic, semi-synthetic or recombinant. The sequences of the invention may also include fragments of the present amino acid sequences. Where “amino acid sequence” is recited to refer to an amino acid sequence of a naturally occurring protein molecule, “amino acid sequence” and like terms are not meant to limit the amino acid sequence to the complete native amino acid sequence associated with the recited protein molecule.

In addition to methods for modifying a plant phenotype by employing one or more polynucleotides and polypeptides of the invention described herein, the polynucleotides and polypeptides of the invention have a variety of additional uses. These uses include their use in the recombinant production (i.e., expression) of proteins; as regulators of plant gene expression; as diagnostic probes for the presence of complementary or partially complementary nucleic acids (including for detection of natural coding nucleic acids); as substrates for further reactions, e.g., mutation reactions, PCR reactions, or the like; as substrates for cloning, e.g., including digestion or ligation reactions; and for identifying exogenous or endogenous modulators of the transcription factors. The polynucleotide can be, e.g., genomic DNA or RNA, a transcript (such as an mRNA), a cDNA, a PCR product, a cloned DNA, a synthetic DNA or RNA, or the like. The polynucleotide can comprise a sequence in either sense or antisense orientations.

Expression of genes that encode polypeptides that modify expression of endogenous genes, polynucleotides, and proteins is well known in the art. In addition, transgenic plants comprising isolated polynucleotides encoding transcription factors may also modify expression of endogenous genes, polynucleotides, and proteins. Examples include Peng et al. (1997) and Peng et al. (1999). In addition, many others have demonstrated that an Arabidopsis transcription factor expressed in an exogenous plant species elicits the same or very similar phenotypic response. See, for example, Fu et al. (2001); Nandi et al. (2000); Coupland (1995); and Weigel and Nilsson (1995).

In another example, Mandel et al. (1992b), and Suzuki et al. (2001), teach that a transcription factor expressed in another plant species elicits the same or very similar phenotypic response of the endogenous sequence, as often predicted in earlier studies of Arabidopsis transcription factors in Arabidopsis (see Mandel et al. (1992a); Suzuki et al. (2001)). Other examples include Miller et al. (2001); Kim et al. (2001); Kyozuka and Shimamoto (2002); Boss and Thomas (2002); He et al. (2000); and Robson et al. (2001).

In yet another example, Gilmour et al. (1998) teach an Arabidopsis AP2 transcription factor, CBF1, which, when overexpressed in transgenic plants, increases plant freezing tolerance. Jaglo et al. (2001) further identified sequences in Brassica napus which encode CBF-like genes and that transcripts for these genes accumulated rapidly in response to low temperature. Transcripts encoding CBF-like proteins were also found to accumulate rapidly in response to low temperature in wheat, as well as in tomato. An alignment of the CBF proteins from Arabidopsis, B. napus, wheat, rye, and tomato revealed the presence of conserved consecutive amino acid residues, PKK/RPAGRxKFxETRHP (SEQ ID NO: 69) and DSAWR (SEQ ID NO: 70), which bracket the AP2/EREBP DNA binding domains of the proteins and distinguish them from other members of the AP2/EREBP protein family (Jaglo et al. (2001)).
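The two bracketing signature sequences can be expressed as search patterns, as in the following illustrative sketch. It assumes that “K/R” denotes either lysine or arginine at the third position and that “x” denotes any single residue; the function name and the test sequence are hypothetical, and the test sequence is synthetic rather than an actual CBF protein.

```python
import re

# PKK/RPAGRxKFxETRHP read as P-K-[K or R]-P-A-G-R-x-K-F-x-E-T-R-H-P (assumption).
N_TERMINAL_SIGNATURE = re.compile(r"PK[KR]PAGR.KF.ETRHP")
C_TERMINAL_SIGNATURE = re.compile(r"DSAWR")


def bracketed_region(protein: str):
    """Locate the region lying between the two signature sequences.

    Returns (start, end) as 1-based inclusive residue coordinates, or None
    if either signature is absent or the second does not follow the first.
    """
    sequence = protein.upper()
    first = N_TERMINAL_SIGNATURE.search(sequence)
    if first is None:
        return None
    second = C_TERMINAL_SIGNATURE.search(sequence, first.end())
    if second is None:
        return None
    return (first.end() + 1, second.start())


# Synthetic sequence: two residues, first signature, five alanines, second signature.
synthetic = "MM" + "PKKPAGRKKFRETRHP" + "AAAAA" + "DSAWR" + "GG"
```

For the synthetic sequence, `bracketed_region(synthetic)` returns `(19, 23)`, the coordinates of the five alanine residues standing in for the bracketed domain.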

Transcription factors mediate cellular responses and control traits through altered expression of genes containing cis-acting nucleotide sequences that are targets of the introduced transcription factor. It is well appreciated in the art that the effect of a transcription factor on cellular responses or a cellular trait is determined by the particular genes whose expression is either directly or indirectly (e.g., by a cascade of transcription factor binding events and transcriptional changes) altered by transcription factor binding. In a global analysis of transcription comparing a standard condition with one in which a transcription factor is overexpressed, the resulting transcript profile associated with transcription factor overexpression is related to the trait or cellular process controlled by that transcription factor. For example, the PAP2 gene and other genes in the MYB family have been shown to control anthocyanin biosynthesis through regulation of the expression of genes known to be involved in the anthocyanin biosynthetic pathway (Bruce et al. (2000); and Borevitz et al. (2000)). Further, global transcript profiles have been used successfully as diagnostic tools for specific cellular states (e.g., cancerous vs. non-cancerous; Bhattacharjee et al. (2001); and Xu et al. (2001)). Consequently, it is evident to one skilled in the art that similarity of transcript profile upon overexpression of different transcription factors would indicate similarity of transcription factor function.

Polypeptides and Polynucleotides of the Invention

The present invention includes putative transcription factors (TFs), and isolated or recombinant polynucleotides encoding the polypeptides, or novel sequence variant polypeptides or polynucleotides encoding novel variants of polypeptides derived from the specific sequences provided in the Sequence Listing; the recombinant polynucleotides of the invention may be incorporated in expression vectors for the purpose of producing transformed plants. Also provided are methods for modifying yield from a plant by modifying the mass, size or number of plant organs or seed of a plant by controlling a number of cellular processes, and for increasing a plant's resistance to abiotic stresses. These methods are based on the ability to alter the expression of critical regulatory molecules that may be conserved between diverse plant species. Related conserved regulatory molecules may be originally discovered in a model system such as Arabidopsis and homologous, functional molecules then discovered in other plant species. The latter may then be used to confer increased yield or abiotic stress tolerance in diverse plant species.

Exemplary polynucleotides encoding the polypeptides of the invention were identified in the Arabidopsis thaliana GenBank database using publicly available sequence analysis programs and parameters. Sequences initially identified were then further characterized to identify sequences comprising specified sequence strings corresponding to sequence motifs present in families of known polypeptides. In addition, further exemplary polynucleotides encoding the polypeptides of the invention were identified in the plant GenBank database using publicly available sequence analysis programs and parameters. Sequences initially identified were then further characterized to identify sequences comprising specified sequence strings corresponding to sequence motifs present in families of known polypeptides.

Additional polynucleotides of the invention were identified by screening Arabidopsis thaliana and/or other plant cDNA libraries with probes corresponding to known polypeptides under low stringency hybridization conditions. Additional sequences, including full length coding sequences, were subsequently recovered by the rapid amplification of cDNA ends (RACE) procedure using a commercially available kit according to the manufacturer's instructions. Where necessary, multiple rounds of RACE were performed to isolate 5′ and 3′ ends. The full-length cDNA was then recovered by a routine end-to-end polymerase chain reaction (PCR) using primers specific to the isolated 5′ and 3′ ends. Exemplary sequences are provided in the Sequence Listing.

Many of the sequences in the Sequence Listing, derived from diverse plant species, have been ectopically expressed in overexpressor plants. The resulting changes in the characteristic(s) or trait(s) of the plants were then observed, and the sequences were found to confer increased yield and/or increased abiotic stress tolerance. Therefore, the polynucleotides and polypeptides can be used to improve desirable characteristics of plants.

The polynucleotides of the invention were also ectopically expressed in overexpressor plant cells, and the changes in the expression levels of a number of genes, polynucleotides, and/or proteins of the plant cells were observed. Therefore, the polynucleotides and polypeptides can be used to change expression levels of genes, polynucleotides, and/or proteins of plants or plant cells.

The data presented herein represent the results obtained in experiments with polynucleotides and polypeptides that may be expressed in plants for the purpose of reducing yield losses that arise from biotic and abiotic stress.

Background Information for G1988, the G1988 Clade, and Related Sequences

G1988 belongs to the CONSTANS-like family of zinc finger proteins, which was defined based on a Zn-finger domain known as the B-box. The B-box has homology to a protein-protein interaction domain found in animal transcription factors (Robson et al., 2001; Borden, 1998; Torok and Etkin, 2001), and the B-domains of G1988 and its close homolog clade members function in the same protein-protein interaction capacity. The CONSTANS-like proteins contain one or two N-terminal B-box motifs (the G1988 clade contains a single N-terminal B-box domain). G1988 and its homologs from other species share conserved C-terminal motifs that define a clear clade that is distinct from other B-box proteins, and generally contain the signature residues identified by the triangles in FIGS. 4D and 4E, and by SEQ ID NOs: 62, 57, and 58. G1988 is expressed in many tissues. G1988 and its homologs are diurnally regulated.

As disclosed below in the Examples, constitutive expression of G1988 in Arabidopsis modulates diverse plant growth processes, including elongation of hypocotyls, extended petioles and upheld leaves; early flowering; enhanced root and/or shoot growth in phosphate-limited media; more secondary roots on control media; enhanced growth and reduced anthocyanin in low nitrogen/high sucrose media supplemented with glutamine; enhanced root growth on salt-containing media; and enhanced root growth on polyethylene glycol-containing media, as compared to control plants. G1988 overexpression in soybean plants has been shown to result in a statistically significant increase in yield in field trials (see FIG. 6 and Examples presented below) as compared to parental line controls.

The G1988 clade includes a number of sequences descended from a common ancestral sequence, as shown in the phylogenetic tree seen in FIG. 3. The ancestral sequence is represented by the node of the tree indicated by the arrow in FIG. 3 having a bootstrap value of 74. Examples of clade members include those sequences within the box and bounded by G4011 and G4009 in FIG. 3. Polypeptide members of the G1988 clade examined to date, including G1988 and phylogenetically-related sequences from diverse species, comprise several characteristic structural features, including a highly conserved B-domain, indicated in FIGS. 4A and 4B, and several characteristic or signature residues outside of and nearer to the C-terminus than the B-domain. Signature residues are indicated by the small dark triangles in FIGS. 4D and 4E. These residues comprise, in order from N to C termini:

-   W-X₄-G (SEQ ID NO: 62, where X represents any amino acid; seen in FIG. 4D)
-   R-X₃-A-X₃-W (SEQ ID NO: 57, where X represents any amino acid; seen in FIG. 4D) followed by:
-   EGWXE (SEQ ID NO: 58; where X represents any amino acid; seen in FIG. 4E).

Thus, a G1988 clade sequence may be defined as having a highly conserved B-domain at least 56% identical in its amino acid sequence to SEQ ID NO: 45. G1988 clade members examined thus far may be further defined by having amino acid residues characterized by a tryptophan residue and a glycine residue at the positions corresponding to the first and fifth residues shown in FIG. 4D nearer the C-terminus than said B-domain, and/or by having SEQ ID NO: 57 nearer the C-terminus than said tryptophan residue, and/or by having SEQ ID NO: 58 nearer the C-terminus than SEQ ID NO: 57.
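The ordering of the signature residues described above can be illustrated with a short sketch. It assumes the three motifs can be written as simple regular expressions taken from the motif strings quoted in the text, and it applies the strictest reading (all three motifs present, in N-to-C order, after the B-domain), whereas the text allows "and/or" combinations. The function name, the `b_domain_end` argument, and the toy sequence are hypothetical and are not taken from the Sequence Listing.

```python
import re

# Signature motifs as quoted in the text (X = any amino acid).
W_X4_G = re.compile(r"W.{4}G")            # SEQ ID NO: 62
R_X3_A_X3_W = re.compile(r"R.{3}A.{3}W")  # SEQ ID NO: 57
EGWXE = re.compile(r"EGW.E")              # SEQ ID NO: 58


def has_ordered_signatures(protein: str, b_domain_end: int) -> bool:
    """Return True if all three motifs occur, in N-to-C order, after the B-domain.

    b_domain_end is the 0-based index just past the B-domain
    (a hypothetical input supplied by the caller).
    """
    pos = b_domain_end
    for motif in (W_X4_G, R_X3_A_X3_W, EGWXE):
        match = motif.search(protein, pos)
        if match is None:
            return False
        pos = match.end()  # the next motif must start after this one
    return True


# Made-up toy sequence, for illustration only.
toy = "MAAAA" + "WSSSSG" + "KK" + "RTTTATTTW" + "LL" + "EGWQE"
print(has_ordered_signatures(toy, 5))
```

A check of this kind addresses only the signature residues; the B-domain identity threshold of 56% to SEQ ID NO: 45 would have to be evaluated separately from an alignment.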

It is likely that the ectopic expression of G1988 product can affect light signaling, or downstream hormonal pathways. Based upon the observations described above, G1988 appears to be involved in photomorphogenesis and plant growth and development. Hence, its overexpression may improve plant vigor, thus explaining the yield enhancements seen in 35S::G1988 soybean plants as noted below.

A number of sequences have been found in other plant species that are closely-related to G1988. Table 1 shows a number of polypeptides of the invention and includes the SEQ ID NO: (Column 1), the species from which the sequence was derived and the Gene Identifier (“GID”; Column 2), the percent identity of the polypeptide in Column 1 to the full length G1988 polypeptide, SEQ ID NO: 1, as determined by a BLASTp analysis with a wordlength (W) of 3, an expectation (E) of 10, and the BLOSUM62 scoring matrix (Henikoff & Henikoff (1989, 1991)) (Column 3), the amino acid residue coordinates for the conserved B-box ZF domains, in amino acid coordinates beginning at the N-terminus, of each of the sequences (Column 4), the conserved B-box ZF domain sequences of the respective polypeptides (Column 5), the SEQ ID NO: of each of the B-box ZF domains (Column 6), and the percentage identity of the conserved domain in Column 5 to the conserved domain of the Arabidopsis G1988 sequence, SEQ ID NO: 45 (Column 7).

TABLE 1 Conserved domains of G1988 and closely related sequences

| Column 1: Polypeptide SEQ ID NO: | Column 2: Species/GID No. | Column 3: Percent identity of polypeptide in Column 1 to G1988 | Column 4: B-box ZF domain in amino acid coordinates | Column 5: B-box ZF domain | Column 6: SEQ ID NO: of B-box ZF domain | Column 7: Percent identity of B-box ZF domain in Column 5 to conserved domain of G1988 |
|---|---|---|---|---|---|---|
| 2 | At/G1988 | 100% | 5-50 | CELCGAEADLHCAADSAFLCRSCDAKFHASNFLFARHFRRVICPNC | 45 | 100% |
| 18 | Zm/G4297 | 30% | 14-55 | CELCGGAAAVHCAADSAFLCPRCDAKVHGANFLASRHVRRRL | 53 | 70% |
| 24 | Zm/G4001 | 30% | 20-61 | CELCGGAAAVHCAADSAFLCLRCDAKVHGANFLASRHVRRRL | 56 | 70% |
| 16 | Os/G4012 | 32% | 15-56 | CELCGGVAAVHCAADSAFLCLVCDDKVHGANFLASRHRRRRL | 52 | 67% |
| 20 | Os/G4298 | 67% | 15-55 | CELCGGVAAVHCAADSAFLCLVCDDKVHGANFLASRHPRRR | 54 | 67% |
| 14 | Os/G4011 | 33% | 8-49 | CALCGAAAAVHCEADAAFLCAACDAKVHGANFLASRHHRRRV | 51 | 65% |
| 8 | Zm/G4000 | 30% | 20-61 | CELCGGAAAVHCAADSAFLCLRCDAKVHGANFLASRHVRRRL | 48 | 65% |
| 66 | Ta/Ta1988 | 33% | 13-54 | CELCGGVAAVHCAADSAFLCVPCDAKVHGANFLASRHLRRRL | 68 | 61% |
| 4 | Gm/G4004 | 33% | 6-51 | CELCHQLASLYCPSDSAFLCFHCDAAVHAANFLVARHLRRLLCSKC | 46 | 60% |
| 6 | Gm/G4005 | 32% | 6-51 | CELCDQQASLYCPSDSAFLCSDCDAAVHAANFLVARHLRRLLCSKC | 47 | 60% |
| 10 | Ct/G4007 | 45% | 5-50 | CELCSQEAALHCASDEAFLCFDCDDRVHKANFLVARHVRQTLCSQC | 49 | 58% |
| 22 | Le/G4299 | 36% | 9-54 | CELCNDQAALFCPSDSAFLCFHCDAKVHQANFLVARHLRLTLCSHC | 55 | 58% |
| 12 | Pt/G4009 | 40% | 6-51 | CELCKGEAGVYCDSDAAYLCFDCDSNVHNANFLVARHIRRVICSGC | 50 | 56% |

Species abbreviations for Table 1: At: Arabidopsis thaliana; Ct: Citrus sinensis; Gm: Glycine max; Le: Lycopersicon esculentum; Os: Oryza sativa; Pt: Populus trichocarpa; Ta: Triticum aestivum; Zm: Zea mays.

¹ phenotype observed in both Arabidopsis and soy plants

Tables 2 and 3 list some of the morphological and physiological traits that were conferred to Arabidopsis, soy or corn plants overexpressing G1988 or orthologs from diverse species of plants, including Arabidopsis, soy, maize, rice, and tomato, in experiments conducted to date. All observations are made with respect to control plants that did not overexpress a G1988 clade transcription factor.

TABLE 2 G1988 homologs and potentially valuable morphology-related traits

| Col. 1: GID (SEQ ID No.), Species | Col. 2: Reduced light response: elongated hypocotyls, elongated petioles or upright leaves | Col. 3: Increased yield* | Col. 4: Increased secondary roots | Col. 5: Delayed development and/or time to flowering |
|---|---|---|---|---|
| G1988 (2), At | +¹ | +³ | +¹ | +¹,³ |
| G4004 (4), Gm | +¹ | n/d | n/d | +¹ |
| G4005 (6), Gm | +¹ | n/d | n/d | +¹ |
| G4000 (8), Zm | +¹ | n/d | n/d | +¹ |
| G4012 (16), Os | +¹ | n/d | n/d | +¹ |
| G4299 (22), Sl | +¹ | n/d | n/d | +¹ |

*yield may be increased by morphological improvements and/or increased tolerance to various physiological stresses

TABLE 3 G1988 homologs and potentially valuable physiological traits

Col. 1: GID (SEQ ID No.), Species
Col. 2: Better germination in cold conditions
Col. 3: Increased water deprivation tolerance
Col. 4: Altered C/N sensing or low N tolerance
Col. 5: Increased low P tolerance
Col. 6: Increased hyperosmotic stress (sucrose) tolerance

G1988 (2), At: +³ +¹,³ +¹ +¹ +¹
G4004 (4), Gm: +¹,²,³ n/d +¹,² −¹
G4005 (6), Gm: −¹ +¹ −¹
G4000 (8), Zm: n/d n/d n/d n/d n/d
G4012 (16), Os: n/d n/d n/d n/d n/d
G4299 (22), Sl: n/d n/d n/d n/d n/d

Species abbreviations for Tables 2 and 3: At: Arabidopsis thaliana; Gm: Glycine max; Os: Oryza sativa; Sl: Solanum lycopersicum; Zm: Zea mays

-   (+) indicates positive assay result/more tolerant or phenotype observed, relative to controls
-   (−) indicates negative assay result/less tolerant or phenotype observed, relative to controls
-   empty cell: assay result similar to controls
-   ¹ phenotype observed in Arabidopsis plants
-   ² phenotype observed in maize plants
-   ³ phenotype observed in soy plants
-   n/d: assay not yet done or completed
-   N: Altered C/N sensing or low nitrogen tolerance
-   P: phosphorus

Water deprivation tolerance was indicated in soil-based drought or plate-based desiccation assays.

Hyperosmotic stress was indicated by greater tolerance to 9.4% sucrose than controls.

Increased cold tolerance was indicated by greater tolerance to 8° C. during germination or growth than controls.

Altered C/N sensing or low nitrogen tolerance assays were conducted in basal media minus nitrogen plus 3% sucrose or basal media minus nitrogen plus 3% sucrose and 1 mM glutamine; for the nitrogen limitation assay, the nitrogen source of 80% MS medium was reduced to 20 mg/L of NH₄NO₃.

Increased low P tolerance was indicated by better growth in MS medium lacking a phosphorus source.

A reduced light sensitivity phenotype was indicated by longer petioles, longer hypocotyls and/or upturned leaves relative to control plants.

Orthologs and Paralogs

Homologous sequences as described above can comprise orthologous or paralogous sequences. Several different methods are known by those of skill in the art for identifying and defining these functionally homologous sequences. General methods for identifying orthologs and paralogs, including phylogenetic methods, sequence similarity and hybridization methods, are described herein; an ortholog or paralog, including equivalogs, may be identified by one or more of the methods described below.

As described by Eisen (1998) Genome Res. 8: 163-167, evolutionary information may be used to predict gene function. It is common for groups of genes that are homologous in sequence to have diverse, although usually related, functions. However, in many cases, the identification of homologs is not sufficient to make specific predictions because not all homologs have the same function. Thus, an initial analysis of functional relatedness based on sequence similarity alone may not provide one with a means to determine where similarity ends and functional relatedness begins. Fortunately, it is well known in the art that protein function can be classified using phylogenetic analysis of gene trees combined with the corresponding species. Functional predictions can be greatly improved by focusing on how the genes became similar in sequence (i.e., by evolutionary processes) rather than on the sequence similarity itself (Eisen, supra). In fact, many specific examples exist in which gene function has been shown to correlate well with gene phylogeny (Eisen, supra). Thus, “[t]he first step in making functional predictions is the generation of a phylogenetic tree representing the evolutionary history of the gene of interest and its homologs. Such trees are distinct from clusters and other means of characterizing sequence similarity because they are inferred by techniques that help convert patterns of similarity into evolutionary relationships . . . . After the gene tree is inferred, biologically determined functions of the various homologs are overlaid onto the tree. Finally, the structure of the tree and the relative phylogenetic positions of genes of different functions are used to trace the history of functional changes, which is then used to predict functions of [as yet] uncharacterized genes” (Eisen, supra).

Within a single plant species, gene duplication may produce two copies of a particular gene, giving rise to two or more genes with similar sequence and often similar function known as paralogs. A paralog is therefore a similar gene formed by duplication within the same species. Paralogs typically cluster together or in the same clade (a group of similar genes) when a gene family phylogeny is analyzed using programs such as CLUSTAL (Thompson et al. (1994); Higgins et al. (1996)). Groups of similar genes can also be identified with pair-wise BLAST analysis (Feng and Doolittle (1987)). For example, a clade of very similar MADS domain transcription factors from Arabidopsis all share a common function in flowering time (Ratcliffe et al. (2001)), and a group of very similar AP2 domain transcription factors from Arabidopsis are involved in tolerance of plants to freezing (Gilmour et al. (1998)). Analysis of groups of similar genes with similar function that fall within one clade can yield sub-sequences that are particular to the clade. These sub-sequences, known as consensus sequences, can be used not only to define the sequences within each clade, but also to define the functions of these genes; genes within a clade may contain paralogous sequences, or orthologous sequences that share the same function (see also, for example, Mount (2001)).

Transcription factor gene sequences are conserved across diverse eukaryotic species lines (Goodrich et al. (1993); Lin et al. (1991); Sadowski et al. (1988)). Plants are no exception to this observation; diverse plant species possess transcription factors that have similar sequences and functions. Speciation, the production of new species from a parental species, gives rise to two or more genes with similar sequence and similar function. These genes, termed orthologs, often have an identical function within their host plants and are often interchangeable between species without losing function. Because plants have common ancestors, many genes in any plant species will have a corresponding orthologous gene in another plant species. Once a phylogenetic tree for a gene family of one species has been constructed using a program such as CLUSTAL (Thompson et al. (1994); Higgins et al. (1996)), potential orthologous sequences can be placed into the phylogenetic tree and their relationship to genes from the species of interest can be determined. Orthologous sequences can also be identified by a reciprocal BLAST strategy. Once an orthologous sequence has been identified, the function of the ortholog can be deduced from the identified function of the reference sequence.
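The reciprocal BLAST strategy mentioned above can be sketched as a reciprocal-best-hit filter. This is a minimal illustration that assumes the BLAST searches have already been run in both directions and summarized as score tables; the gene names and scores are hypothetical, and no BLAST software is invoked.

```python
from typing import Dict, List, Tuple

# query gene -> {subject gene: alignment score}; all values are hypothetical
Hits = Dict[str, Dict[str, float]]


def best_hit(hits: Dict[str, float]) -> str:
    """Return the subject gene with the highest score."""
    return max(hits, key=hits.get)


def reciprocal_best_hits(a_vs_b: Hits, b_vs_a: Hits) -> List[Tuple[str, str]]:
    """Pair genes whose best hits point at each other in both search directions."""
    pairs = []
    for gene_a, hits in a_vs_b.items():
        if not hits:
            continue
        gene_b = best_hit(hits)
        back = b_vs_a.get(gene_b)
        if back and best_hit(back) == gene_a:
            pairs.append((gene_a, gene_b))
    return sorted(pairs)


# Hypothetical scores for two species, A and B.
forward = {"A1": {"B1": 250.0, "B2": 90.0}, "A2": {"B2": 120.0, "B1": 110.0}}
reverse = {"B1": {"A1": 245.0, "A2": 100.0}, "B2": {"A1": 95.0, "A2": 60.0}}
print(reciprocal_best_hits(forward, reverse))
```

In this toy data only A1 and B1 select each other, so they form the single candidate ortholog pair; A2 selects B2, but B2's best hit is A1, so that pair is rejected.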

By using a phylogenetic analysis, one skilled in the art would recognize that the ability to deduce similar functions conferred by closely-related polypeptides is predictable. This predictability has been confirmed by our own many studies in which we have found that a wide variety of polypeptides have orthologous or closely-related homologous sequences that function as does the first, closely-related reference sequence. For example, distinct transcription factors, including:

(i) AP2 family Arabidopsis G47 (found in U.S. Pat. No. 7,135,616), a phylogenetically-related sequence from soybean, and two phylogenetically-related homologs from rice all can confer greater tolerance to drought, hyperosmotic stress, or delayed flowering as compared to control plants;

(ii) CAAT family Arabidopsis G481 (found in PCT patent publication WO2004076638), and numerous phylogenetically-related sequences from eudicots and monocots can confer greater tolerance to drought-related stress as compared to control plants;

(iii) Myb-related Arabidopsis G682 (found in U.S. Pat. Nos. 7,223,904 and 7,193,129) and numerous phylogenetically-related sequences from eudicots and monocots can confer greater tolerance to heat, drought-related stress, cold, and salt as compared to control plants;

(iv) WRKY family Arabidopsis G1274 (found in U.S. Pat. No. 7,196,245) and numerous closely-related sequences from eudicots and monocots have been shown to confer increased water deprivation tolerance; and

(v) AT-hook family soy sequence G3456 (found in US patent publication 20040128712A1) and numerous phylogenetically-related sequences from eudicots and monocots can confer increased biomass as compared to control plants when these sequences are overexpressed in plants.

The polypeptide sequences belong to distinct clades of polypeptides that include members from diverse species. In each case, most or all of the clade member sequences derived from both eudicots and monocots have been shown to confer increased yield or tolerance to one or more abiotic stresses when the sequences were overexpressed. These studies each demonstrate that evolutionarily conserved genes from diverse species are likely to function similarly (i.e., by regulating similar target sequences and controlling the same traits), and that polynucleotides from one species may be transformed into closely-related or distantly-related plant species to confer or improve traits.

As shown in Table 1, polypeptides that are phylogenetically related to the polypeptides of the invention may have conserved domains that share at least 56%, 58%, 60%, 65%, 67%, or 70%, 75%, 80%, 85%, 90%, or 95% amino acid sequence identity, and have similar functions in that the polypeptides of the invention may, when overexpressed, confer at least one regulatory activity selected from the group consisting of greater yield, more rapid growth, greater size, increased secondary rooting, greater cold tolerance, greater tolerance to water deprivation, reduced stomatal conductance, altered C/N sensing or increased low nitrogen tolerance, increased low phosphorus tolerance, increased tolerance to hyperosmotic stress, and/or reduced light sensitivity as compared to a control plant.

At the nucleotide level, the sequences of the invention will typically share at least about 30% or 40% nucleotide sequence identity, preferably at least about 50%, about 60%, about 70% or about 80% sequence identity, and more preferably about 85%, about 90%, about 95% or about 97% or more sequence identity to one or more of the listed full-length sequences, or to a listed sequence but excluding or outside of the region(s) encoding a known consensus sequence or consensus DNA-binding site, or outside of the region(s) encoding one or all conserved domains. The degeneracy of the genetic code enables major variations in the nucleotide sequence of a polynucleotide while maintaining the amino acid sequence of the encoded protein.
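The effect of codon degeneracy can be shown with a small worked example. The sketch below assumes the standard genetic code and uses two made-up coding sequences (not sequences of the invention) that differ at three of twelve nucleotide positions yet encode the same peptide; the codon table is deliberately partial and covers only the codons used here.

```python
# Partial standard codon table: only the codons used in this example.
CODON_TABLE = {
    "ATG": "M",
    "GCT": "A", "GCG": "A",
    "CTT": "L", "TTA": "L",
    "TGG": "W",
}


def translate(dna: str) -> str:
    """Translate an in-frame coding sequence using the partial table above."""
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def percent_identity(a: str, b: str) -> float:
    """Position-by-position identity of two equal-length, ungapped sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    same = sum(x == y for x, y in zip(a, b))
    return 100.0 * same / len(a)


seq1 = "ATGGCTCTTTGG"  # hypothetical coding sequence
seq2 = "ATGGCGTTATGG"  # synonymous variant of seq1
print(translate(seq1), translate(seq2), percent_identity(seq1, seq2))
```

Here the two nucleotide sequences are 75% identical while the encoded peptides (MALW) are 100% identical, which is why nucleotide-level thresholds are lower than the corresponding amino acid-level thresholds.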

Percent identity can be determined electronically, e.g., by using the MEGALIGN program (DNASTAR, Inc., Madison, Wis.). The MEGALIGN program can create alignments between two or more sequences according to different methods, for example, the clustal method (see, for example, Higgins and Sharp (1988)). The clustal algorithm groups sequences into clusters by examining the distances between all pairs. The clusters are aligned pairwise and then in groups. Other alignment algorithms or programs, including FASTA, BLAST, or ENTREZ, may also be used to calculate percent similarity. These are available as a part of the GCG sequence analysis package (University of Wisconsin, Madison, Wis.), and can be used with or without default settings. ENTREZ is available through the National Center for Biotechnology Information. In one embodiment, the percent identity of two sequences can be determined by the GCG program with a gap weight of 1, i.e., each amino acid gap is weighted as if it were a single amino acid or nucleotide mismatch between the two sequences (see U.S. Pat. No. 6,262,333).

Software for performing BLAST analyses is publicly available, e.g., through the National Center for Biotechnology Information (see internet website at http://www.ncbi.nlm.nih.gov/). This algorithm involves first identifying high scoring sequence pairs (HSPs) by identifying short words of length W in the query sequence, which either match or satisfy some positive-valued threshold score T when aligned with a word of the same length in a database sequence. T is referred to as the neighborhood word score threshold (Altschul (1990); Altschul et al. (1993)). These initial neighborhood word hits act as seeds for initiating searches to find longer HSPs containing them. The word hits are then extended in both directions along each sequence for as far as the cumulative alignment score can be increased. Cumulative scores are calculated using, for nucleotide sequences, the parameters M (reward score for a pair of matching residues; always >0) and N (penalty score for mismatching residues; always <0). For amino acid sequences, a scoring matrix is used to calculate the cumulative score. Extension of the word hits in each direction is halted when: the cumulative alignment score falls off by the quantity X from its maximum achieved value; the cumulative score goes to zero or below, due to the accumulation of one or more negative-scoring residue alignments; or the end of either sequence is reached. The BLAST algorithm parameters W, T, and X determine the sensitivity and speed of the alignment. The BLASTN program (for nucleotide sequences) uses as defaults a wordlength (W) of 11, an expectation (E) of 10, a cutoff of 100, M=5, N=−4, and a comparison of both strands. For amino acid sequences, the BLASTP program uses as defaults a wordlength (W) of 3, an expectation (E) of 10, and the BLOSUM62 scoring matrix (see Henikoff & Henikoff (1989, 1991)). Unless otherwise indicated for comparisons of predicted polynucleotides, “sequence identity” refers to the % sequence identity generated from a tblastx using the NCBI version of the algorithm at the default settings using gapped alignments with the filter “off” (see, for example, internet website at http://www.ncbi.nlm.nih.gov/).
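The three halting conditions for word-hit extension described above can be illustrated with a simplified, ungapped extension in one direction. This is a sketch only and is not the NCBI implementation: the function name and the value chosen for X are hypothetical, while M=5 and N=−4 are the BLASTN defaults quoted in the text.

```python
from typing import Tuple


def extend_right(query: str, subject: str, q_start: int, s_start: int,
                 seed_score: int, match: int = 5, mismatch: int = -4,
                 x_drop: int = 10) -> Tuple[int, int]:
    """Extend a seed hit to the right without gaps.

    Returns (best cumulative score, number of residues added at that score).
    Extension halts when the score falls x_drop below its maximum, when the
    score reaches zero or below, or when either sequence ends.
    """
    score = best = seed_score
    best_len = 0
    i = 0
    while q_start + i < len(query) and s_start + i < len(subject):
        score += match if query[q_start + i] == subject[s_start + i] else mismatch
        i += 1
        if score > best:
            best, best_len = score, i
        if best - score >= x_drop or score <= 0:
            break
    return best, best_len


# Toy sequences: a 4-residue seed (score 4 * 5 = 20) followed by 4 more
# matches and then a run of mismatches.
q = "ACGTACGTTTTT"
s = "ACGTACGTAAAA"
print(extend_right(q, s, 4, 4, seed_score=20))
```

With these toy inputs the score climbs from 20 to 40 over four matches, then drops by 4 per mismatch until the drop reaches X, so the extension is trimmed back to the four added matches.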

Other techniques for alignment are described by Doolittle (1996). Preferably, an alignment program that permits gaps in the sequence is utilized to align the sequences. The Smith-Waterman algorithm is one type of algorithm that permits gaps in sequence alignments (see Shpaer (1997)). Also, the GAP program using the Needleman and Wunsch alignment method can be utilized to align sequences. An alternative search strategy uses MPSRCH software, which runs on a MASPAR computer. MPSRCH uses a Smith-Waterman algorithm to score sequences on a massively parallel computer. This approach improves the ability to pick up distantly related matches, and is especially tolerant of small gaps and nucleotide sequence errors. Nucleic acid-encoded amino acid sequences can be used to search both protein and DNA databases.

The percentage similarity between two polypeptide sequences, e.g., sequence A and sequence B, is calculated by dividing the length of sequence A, minus the number of gap residues in sequence A, minus the number of gap residues in sequence B, into the sum of the residue matches between sequence A and sequence B, times one hundred. Gaps of low or of no similarity between the two amino acid sequences are not included in determining percentage similarity. Percent identity between polynucleotide sequences can also be counted or calculated by other methods known in the art, e.g., the Jotun Hein method (see, for example, Hein (1990)). Identity between sequences can also be determined by other methods known in the art, e.g., by varying hybridization conditions (see US Patent Application No. 20010010913).
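The percentage similarity calculation described above can be written out as a short function. The sketch rests on two assumptions that the text does not settle: that "length of sequence A" means the length of the aligned row for sequence A (gaps included), and that a "residue match" means an identical residue in an alignment column. The function name and the example alignment are hypothetical.

```python
def percent_similarity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Matches divided by (len(A) - gaps in A - gaps in B), times one hundred.

    Both inputs are rows of the same pairwise alignment, so they must have
    equal length; gap positions are marked with the `gap` character.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    gaps_a = aligned_a.count(gap)
    gaps_b = aligned_b.count(gap)
    denominator = len(aligned_a) - gaps_a - gaps_b
    if denominator <= 0:
        raise ValueError("no ungapped positions to compare")
    matches = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != gap
    )
    return 100.0 * matches / denominator


# Toy alignment: one gap in B, four identical residues in five ungapped columns.
print(percent_similarity("ACDEFG", "ACD-FA"))
```

For the toy alignment the denominator is 6 − 0 − 1 = 5 and there are four matches, giving 80%.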

Thus, the invention provides methods for identifying a sequence similar or paralogous or orthologous or homologous to one or more polynucleotides as noted herein, or one or more target polypeptides encoded by the polynucleotides, or otherwise noted herein and may include linking or associating a given plant phenotype or gene function with a sequence. In the methods, a sequence database is provided (locally or across an internet or intranet) and a query is made against the sequence database using the relevant sequences herein and associated plant phenotypes or gene functions.

In addition, one or more polynucleotide sequences or one or more polypeptides encoded by the polynucleotide sequences may be used to search against BLOCKS (Bairoch et al. (1997)), PFAM, and other databases which contain previously identified and annotated motifs, sequences and gene functions. Methods that search for primary sequence patterns with secondary structure gap penalties (Smith et al. (1992)) as well as algorithms such as Basic Local Alignment Search Tool (BLAST; Altschul (1990); Altschul et al. (1993)), BLOCKS (Henikoff and Henikoff (1991)), Hidden Markov Models (HMM; Eddy (1996); Sonnhammer et al. (1997)), and the like, can be used to manipulate and analyze polynucleotide and polypeptide sequences encoded by polynucleotides. These databases, algorithms and other methods are well known in the art and are described in Ausubel et al. (1997), and in Meyers (1995).

A further method for identifying or confirming that specific homologous sequences control the same function is by comparison of the transcript profile(s) obtained upon overexpression or knockout of two or more related polypeptides. Since transcript profiles are diagnostic for specific cellular states, one skilled in the art will appreciate that genes that have a highly similar transcript profile (e.g., with greater than 50% regulated transcripts in common, or with greater than 70% regulated transcripts in common, or with greater than 90% regulated transcripts in common) will have highly similar functions. Fowler and Thomashow (2002) have shown that three paralogous AP2 family genes (CBF1, CBF2 and CBF3) are induced upon cold treatment, that each can condition improved freezing tolerance, and that all have highly similar transcript profiles. Once a polypeptide has been shown to provide a specific function, its transcript profile becomes a diagnostic tool to determine whether paralogs or orthologs have the same function.
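The transcript-profile comparison can be made concrete with a simple overlap measure. The text does not specify how "regulated transcripts in common" is computed, so the sketch below assumes one plausible definition: transcripts regulated in the same direction in both profiles, expressed as a percentage of the smaller profile. The function name, transcript identifiers, and direction encoding (+1 up, −1 down) are hypothetical.

```python
from typing import Dict


def percent_shared(profile_a: Dict[str, int], profile_b: Dict[str, int]) -> float:
    """Percent of regulated transcripts shared between two profiles.

    Each profile maps a transcript identifier to +1 (up-regulated) or
    -1 (down-regulated). A transcript counts as shared only when it is
    regulated in the same direction in both profiles. The denominator is
    the size of the smaller profile (an assumption, not from the text).
    """
    smaller = min(len(profile_a), len(profile_b))
    if smaller == 0:
        raise ValueError("both profiles must contain regulated transcripts")
    shared = sum(1 for t, d in profile_a.items() if profile_b.get(t) == d)
    return 100.0 * shared / smaller


# Hypothetical profiles from overexpressing two related polypeptides.
tf1 = {"t1": 1, "t2": 1, "t3": -1, "t4": 1}
tf2 = {"t1": 1, "t2": -1, "t3": -1, "t4": 1, "t5": 1}
overlap = percent_shared(tf1, tf2)
print(overlap, overlap > 50, overlap > 70, overlap > 90)
```

With these toy profiles three of four transcripts agree, so the overlap is 75%, which clears the 50% and 70% thresholds named in the text but not the 90% threshold.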

Furthermore, methods using manual alignment of sequences similar or homologous to one or more polynucleotide sequences or one or more polypeptides encoded by the polynucleotide sequences may be used to identify regions of similarity and B-box zinc finger domains. Such manual methods are well known to those of skill in the art and can include, for example, comparisons of tertiary structure between a polypeptide sequence encoded by a polynucleotide that comprises a known function and a polypeptide sequence encoded by a polynucleotide sequence that has a function not yet determined. Such examples of tertiary structure may comprise predicted alpha helices, beta-sheets, amphipathic helices, leucine zipper motifs, zinc finger motifs, proline-rich regions, cysteine repeat motifs, and the like.

Orthologs and paralogs of presently disclosed polypeptides may be cloned using compositions provided by the present invention according to methods well known in the art. cDNAs can be cloned using mRNA from a plant cell or tissue that expresses one of the present sequences. Appropriate mRNA sources may be identified by interrogating Northern blots with probes designed from the present sequences, after which a library is prepared from the mRNA obtained from a positive cell or tissue. Polypeptide-encoding cDNA is then isolated using, for example, PCR, using primers designed from a presently disclosed gene sequence, or by probing with a partial or complete cDNA or with one or more sets of degenerate probes based on the disclosed sequences. The cDNA library may be used to transform plant cells. Expression of the cDNAs of interest is detected using, for example, microarrays, Northern blots, quantitative PCR, or any other technique for monitoring changes in expression. Genomic clones may be isolated using similar techniques.

Examples of orthologs of the Arabidopsis polypeptide sequences and their functionally similar orthologs are listed in Table 1 and the Sequence Listing. In addition to the sequences in Table 1 and the Sequence Listing, the invention encompasses isolated nucleotide sequences that are phylogenetically and structurally similar to sequences listed in the Sequence Listing and can function in a plant by increasing yield and/or abiotic stress tolerance when ectopically expressed in a plant.

Since a significant number of these sequences are phylogenetically and sequentially related to each other and have been shown to increase yield from a plant and/or abiotic stress tolerance, one skilled in the art would predict that other similar, phylogenetically related sequences falling within the present clades of polypeptides would also perform similar functions when ectopically expressed.

Identifying Polynucleotides or Nucleic Acids by Hybridization

Polynucleotides homologous to the sequences illustrated in the Sequence Listing and tables can be identified, e.g., by hybridization to each other under stringent or under highly stringent conditions. Single stranded polynucleotides hybridize when they associate based on a variety of well characterized physical-chemical forces, such as hydrogen bonding, solvent exclusion, base stacking and the like. The stringency of a hybridization reflects the degree of sequence identity of the nucleic acids involved, such that the higher the stringency, the more similar are the two polynucleotide strands. Stringency is influenced by a variety of factors, including temperature, salt concentration and composition, organic and non-organic additives, solvents, etc. present in both the hybridization and wash solutions and incubations (and number thereof), as described in more detail in the references cited below (e.g., Sambrook et al. (1989); Berger and Kimmel (1987); and Anderson and Young (1985)).

Encompassed by the invention are polynucleotide sequences that are capable of hybridizing to the claimed polynucleotide sequences, including any of the polynucleotides within the Sequence Listing, and fragments thereof under various conditions of stringency (see, for example, Wahl and Berger (1987); and Kimmel (1987)). In addition to the nucleotide sequences listed in the Sequence Listing, full length cDNA, orthologs, and paralogs of the present nucleotide sequences may be identified and isolated using well-known methods. The cDNA libraries, orthologs, and paralogs of the present nucleotide sequences may be screened using hybridization methods to determine their utility as hybridization target or amplification probes.

With regard to hybridization, conditions that are highly stringent, and means for achieving them, are well known in the art. See, for example, Sambrook et al. (1989); Berger (1987), pages 467-469; and Anderson and Young (1985).

Stability of DNA duplexes is affected by such factors as base composition, length, and degree of base pair mismatch. Hybridization conditions may be adjusted to allow DNAs of different sequence relatedness to hybridize. The melting temperature (T_(m)) is defined as the temperature when 50% of the duplex molecules have dissociated into their constituent single strands. The melting temperature of a perfectly matched duplex, where the hybridization buffer contains formamide as a denaturing agent, may be estimated by the following equations:

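(I) DNA-DNA: $T_m\,(^\circ\mathrm{C}) = 81.5 + 16.6\,\log[\mathrm{Na}^+] + 0.41\,(\%\,\mathrm{G{+}C}) - 0.62\,(\%\,\mathrm{formamide}) - 500/L$

(II) DNA-RNA: $T_m\,(^\circ\mathrm{C}) = 79.8 + 18.5\,\log[\mathrm{Na}^+] + 0.58\,(\%\,\mathrm{G{+}C}) + 0.12\,(\%\,\mathrm{G{+}C})^2 - 0.5\,(\%\,\mathrm{formamide}) - 820/L$

(III) RNA-RNA: $T_m\,(^\circ\mathrm{C}) = 79.8 + 18.5\,\log[\mathrm{Na}^+] + 0.58\,(\%\,\mathrm{G{+}C}) + 0.12\,(\%\,\mathrm{G{+}C})^2 - 0.35\,(\%\,\mathrm{formamide}) - 820/L$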

where L is the length of the duplex formed, [Na+] is the molar concentration of the sodium ion in the hybridization or washing solution, and % G+C is the percentage of (guanine+cytosine) bases in the hybrid. For imperfectly matched hybrids, the melting temperature is reduced by approximately 1° C. for each 1% mismatch.
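As a rough illustration, equation (I) for DNA-DNA duplexes, together with the 1° C. per 1% mismatch correction, can be sketched as follows. The function and parameter names are hypothetical, and the sketch assumes that "log" means the base-10 logarithm, that [Na+] is given in mol/L, and that L is given in base pairs.

```python
import math


def dna_dna_tm(na_molar, gc_percent, formamide_percent, length_bp,
               mismatch_percent=0.0):
    """Estimate the melting temperature (deg C) of a DNA-DNA duplex.

    Implements equation (I) as printed in the text. Assumptions (not stated
    in the source): log is base 10, na_molar is in mol/L, length_bp is in
    base pairs.
    """
    if na_molar <= 0 or length_bp <= 0:
        raise ValueError("na_molar and length_bp must be positive")
    tm = (81.5
          + 16.6 * math.log10(na_molar)
          + 0.41 * gc_percent
          - 0.62 * formamide_percent
          - 500.0 / length_bp)
    # Approximately 1 deg C lower for each 1% base pair mismatch.
    return tm - mismatch_percent


if __name__ == "__main__":
    # Hypothetical probe: 1 M Na+, 50% G+C, 50% formamide, 500 bp duplex.
    print(round(dna_dna_tm(1.0, 50.0, 50.0, 500), 1))
```

Only equation (I) is sketched here; equations (II) and (III) follow the same pattern with their own coefficients.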

Hybridization experiments are generally conducted in a buffer of pH between 6.8 and 7.4, although the rate of hybridization is nearly independent of pH at ionic strengths likely to be used in the hybridization buffer (Anderson and Young (1985)). In addition, one or more of the following may be used to reduce non-specific hybridization: sonicated salmon sperm DNA or another non-complementary DNA, bovine serum albumin, sodium pyrophosphate, sodium dodecylsulfate (SDS), polyvinyl-pyrrolidone, ficoll and Denhardt's solution. Dextran sulfate and polyethylene glycol 6000 act to exclude DNA from solution, thus raising the effective probe DNA concentration and the hybridization signal within a given unit of time. In some instances, conditions of even greater stringency may be desirable or required to reduce non-specific and/or background hybridization. These conditions may be created with the use of higher temperature, lower ionic strength and higher concentration of a denaturing agent such as formamide.

Stringency conditions can be adjusted to screen for moderately similar fragments such as homologous sequences from distantly related organisms, or for highly similar fragments such as genes that duplicate functional enzymes from closely related organisms. The stringency can be adjusted either during the hybridization step or in the post-hybridization washes. Salt concentration, formamide concentration, hybridization temperature and probe lengths are variables that can be used to alter stringency (as described by the formulas above). As a general guideline, high stringency is typically performed at T_(m)−5° C. to T_(m)−20° C., moderate stringency at T_(m)−20° C. to T_(m)−35° C. and low stringency at T_(m)−35° C. to T_(m)−50° C. for duplexes >150 base pairs. Hybridization may be performed at low to moderate stringency (25-50° C. below T_(m)), followed by post-hybridization washes at increasing stringencies. Maximum rates of hybridization in solution are determined empirically to occur at T_(m)−25° C. for DNA-DNA duplex and T_(m)−15° C. for RNA-DNA duplex. Optionally, the degree of dissociation may be assessed after each wash step to determine the need for subsequent, higher stringency wash steps.
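The guideline temperature windows above can be expressed as a small lookup, shown here as a minimal sketch. The names are hypothetical, and because the stated ranges share their boundary values (T_(m)−20° C. and T_(m)−35° C.), the sketch assumes that a boundary temperature is assigned to the more stringent class.

```python
# Offsets below Tm (deg C) for each stringency class, taken from the
# guideline in the text for duplexes longer than 150 base pairs.
STRINGENCY_OFFSETS = {
    "high": (5.0, 20.0),
    "moderate": (20.0, 35.0),
    "low": (35.0, 50.0),
}


def stringency_window(tm_celsius, level):
    """Return (lowest, highest) hybridization temperature for a class."""
    near, far = STRINGENCY_OFFSETS[level]
    return (tm_celsius - far, tm_celsius - near)


def classify_stringency(tm_celsius, temperature_celsius):
    """Classify a temperature relative to Tm, or return None if outside
    all guideline windows. Shared boundaries go to the more stringent
    class (an assumption; the text does not say how to break ties)."""
    offset = tm_celsius - temperature_celsius
    for level in ("high", "moderate", "low"):
        near, far = STRINGENCY_OFFSETS[level]
        if near <= offset <= far:
            return level
    return None


if __name__ == "__main__":
    print(stringency_window(70.0, "high"))
    print(classify_stringency(70.0, 42.0))
```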

High stringency conditions may be used to select for nucleic acid sequences with high degrees of identity to the disclosed sequences. An example of stringent hybridization conditions obtained in a filter-based method such as a Southern or Northern blot for hybridization of complementary nucleic acids that have more than 100 complementary residues is about 5° C. to 20° C. lower than the thermal melting point (T_(m)) for the specific sequence at a defined ionic strength and pH. Conditions used for hybridization may include about 0.02 M to about 0.15 M sodium chloride, about 0.5% to about 5% casein, about 0.02% SDS or about 0.1% N-laurylsarcosine, about 0.001 M to about 0.03 M sodium citrate, at hybridization temperatures between about 50° C. and about 70° C. More preferably, high stringency conditions are about 0.02 M sodium chloride, about 0.5% casein, about 0.02% SDS, about 0.001 M sodium citrate, at a temperature of about 50° C. Nucleic acid molecules that hybridize under stringent conditions will typically hybridize to a probe based on either the entire DNA molecule or selected portions, e.g., to a unique subsequence, of the DNA.

Stringent salt concentration will ordinarily be less than about 750 mM NaCl and 75 mM trisodium citrate. Increasingly stringent conditions may be obtained with less than about 500 mM NaCl and 50 mM trisodium citrate, to even greater stringency with less than about 250 mM NaCl and 25 mM trisodium citrate. Low stringency hybridization can be obtained in the absence of organic solvent, e.g., formamide, whereas high stringency hybridization may be obtained in the presence of at least about 35% formamide, and more preferably at least about 50% formamide. Stringent temperature conditions will ordinarily include temperatures of at least about 30° C., more preferably of at least about 37° C., and most preferably of at least about 42° C. with formamide present. Varying additional parameters, such as hybridization time, the concentration of detergent, e.g., sodium dodecyl sulfate (SDS) and ionic strength, are well known to those skilled in the art. Various levels of stringency are accomplished by combining these various conditions as needed.

The washing steps that follow hybridization may also vary in stringency; the post-hybridization wash steps primarily determine hybridization specificity, with the most critical factors being temperature and the ionic strength of the final wash solution. Wash stringency can be increased by decreasing salt concentration or by increasing temperature. Stringent salt concentration for the wash steps will preferably be less than about 30 mM NaCl and 3 mM trisodium citrate, and most preferably less than about 15 mM NaCl and 1.5 mM trisodium citrate.

Thus, hybridization and wash conditions that may be used to bind and remove polynucleotides with less than the desired homology to the nucleic acid sequences or their complements that encode the present polypeptides include, for example:

6×SSC at 65° C.;

50% formamide, 4×SSC at 42° C.; or

0.5×SSC, 0.1% SDS at 65° C.;

with, for example, two wash steps of 10-30 minutes each. Useful variations on these conditions will be readily apparent to those skilled in the art.

A person of skill in the art would not expect substantial variation among polynucleotide species encompassed within the scope of the present invention because the highly stringent conditions set forth in the above formulae yield structurally similar polynucleotides.

If desired, one may employ wash steps of even greater stringency, including about 0.2×SSC, 0.1% SDS at 65° C. and washing twice, each wash step being about 30 minutes, or about 0.1×SSC, 0.1% SDS at 65° C. and washing twice for 30 minutes. The temperature for the wash solutions will ordinarily be at least about 25° C., and for greater stringency at least about 42° C. Hybridization stringency may be increased further by using the same conditions as in the hybridization steps, with the wash temperature raised about 3° C. to about 5° C., and stringency may be increased even further by using the same conditions except the wash temperature is raised about 6° C. to about 9° C. For identification of less closely related homologs, wash steps may be performed at a lower temperature, e.g., 50° C.

An example of a low stringency wash step employs a solution and conditions of at least 25° C. in 30 mM NaCl, 3 mM trisodium citrate, and 0.1% SDS over 30 minutes. Greater stringency may be obtained at 42° C. in 15 mM NaCl, with 1.5 mM trisodium citrate, and 0.1% SDS over 30 minutes. Even higher stringency wash conditions are obtained at 65° C.-68° C. in a solution of 15 mM NaCl, 1.5 mM trisodium citrate, and 0.1% SDS. Wash procedures will generally employ at least two final wash steps. Additional variations on these conditions will be readily apparent to those skilled in the art (see, for example, US Patent Application No. 20010010913).
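The salt concentrations in this section are given both as SSC multiples and as millimolar NaCl and trisodium citrate. A minimal conversion sketch is shown below, assuming the conventional composition of 1×SSC (150 mM NaCl, 15 mM trisodium citrate), which the text does not itself define; the function names are hypothetical.

```python
# Conventional 1x SSC composition (assumption: not defined in the text).
NACL_MM_PER_1X = 150.0
CITRATE_MM_PER_1X = 15.0


def ssc_to_millimolar(fold):
    """Return (NaCl mM, trisodium citrate mM) for a given SSC multiple."""
    return (fold * NACL_MM_PER_1X, fold * CITRATE_MM_PER_1X)


def nacl_to_ssc_fold(nacl_millimolar):
    """Return the SSC multiple corresponding to an NaCl concentration."""
    return nacl_millimolar / NACL_MM_PER_1X


if __name__ == "__main__":
    for fold in (6.0, 4.0, 0.5, 0.2, 0.1):
        nacl, citrate = ssc_to_millimolar(fold)
        print(f"{fold}x SSC ~ {nacl:.1f} mM NaCl, {citrate:.2f} mM citrate")
```

Under this assumption, the 30 mM NaCl/3 mM citrate and 15 mM NaCl/1.5 mM citrate wash solutions correspond to 0.2×SSC and 0.1×SSC, respectively.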

Stringency conditions can be selected such that an oligonucleotide that is perfectly complementary to the coding oligonucleotide hybridizes to the coding oligonucleotide with at least about a 5-10× higher signal to noise ratio than the ratio for hybridization of the perfectly complementary oligonucleotide to a nucleic acid encoding a polypeptide known as of the filing date of the application. It may be desirable to select conditions for a particular assay such that a higher signal to noise ratio, that is, about 15× or more, is obtained. Accordingly, a subject nucleic acid will hybridize to a unique coding oligonucleotide with at least a 2× or greater signal to noise ratio as compared to hybridization of the coding oligonucleotide to a nucleic acid encoding known polypeptide. The particular signal will depend on the label used in the relevant assay, e.g., a fluorescent label, a colorimetric label, a radioactive label, or the like. Labeled hybridization or PCR probes for detecting related polynucleotide sequences may be produced by oligolabeling, nick translation, end-labeling, or PCR amplification using a labeled nucleotide.

Encompassed by the invention are polynucleotide sequences that are capable of hybridizing to the claimed polynucleotide sequences, including any of the polynucleotides within the Sequence Listing, and fragments thereof under various conditions of stringency (see, for example, Wahl and Berger (1987), pages 399-407; and Kimmel (1987)). In addition to the nucleotide sequences in the Sequence Listing, full length cDNA, orthologs, and paralogs of the present nucleotide sequences may be identified and isolated using well-known methods. The cDNA libraries, orthologs, and paralogs of the present nucleotide sequences may be screened using hybridization methods to determine their utility as hybridization target or amplification probes.

EXAMPLES

It is to be understood that this invention is not limited to the particular devices, machines, materials and methods described. Although particular embodiments are described, equivalent embodiments may be used to practice the invention.

The invention, now being generally described, will be more readily understood by reference to the following examples, which are included merely for purposes of illustration of certain aspects and embodiments of the present invention and are not intended to limit the invention. It will be recognized by one of skill in the art that a polypeptide that is associated with a particular first trait may also be associated with at least one other, unrelated and inherent second trait which was not predicted by the first trait.

Example I. Project Types and Vector and Cloning Information

A number of constructs were used to modulate the activity of sequences of the invention. An individual project was defined as the analysis of lines for a particular construct (for example, this might include G1988 lines that constitutively overexpressed a sequence of the invention). In the present study, a full-length wild-type version of a gene was directly fused to a promoter that drove its expression in transgenic plants. Such a promoter could be the native promoter of that gene, or a constitutive promoter such as the cauliflower mosaic virus 35S promoter. Alternatively, a promoter that drives tissue specific or conditional expression could be used in similar studies.

In the present study, expression of a given polynucleotide from a particular promoter was achieved by a direct-promoter fusion construct in which that sequence was cloned directly behind the promoter of interest. A direct fusion approach has the advantage of allowing for simple genetic analysis if a given promoter-polynucleotide line is to be crossed into different genetic backgrounds at a later date.

For analysis of G1988-overexpressing plants, transgenic lines were created with the expression vector P2499 (SEQ ID NO: 59), which contained a G1988 cDNA clone. This construct constituted a 35S::G1988 direct promoter-fusion carrying a kanamycin resistance marker and was introduced into Arabidopsis plants.

G4004 (polynucleotide SEQ ID NO: 3 and polypeptide SEQ ID NO: 4) is a sequence derived from soybean. G4004 was identified as a closely-related homolog of G1988 based on phylogenetic analysis described above. P26748 (SEQ ID NO: 60) contained a G4004 cDNA clone, and was a 35S::G4004 direct promoter-fusion construct carrying a kanamycin resistance marker. This construct was used to generate lines of transgenic Arabidopsis plants constitutively overexpressing the G4004 polypeptide.

G4005 (polynucleotide SEQ ID NO: 5 and polypeptide SEQ ID NO: 6) was also derived from soybean, and was also identified as a closely-related homolog of G1988 based on phylogenetic analysis described above. P26749 (SEQ ID NO: 61) contained a G4005 cDNA clone, and was a 35S::G4005 direct promoter-fusion construct carrying a kanamycin resistance marker. This construct was used to generate lines of transgenic Arabidopsis plants constitutively overexpressing the G4005 polypeptide.

A list of constructs (these expression vectors are identified by a “PID” designation provided in the second column) used to create plants overexpressing G1988 clade members found in this report is provided in Table 4 and in the Sequence Listing.

TABLE 4 Expression constructs used to create plants overexpressing G1988 clade members

Gene Identifier (species)   Construct (PID)   SEQ ID NO: of PID   Promoter   Project type
G1988 (2) (At)              P2499             59                  35S        Direct promoter-fusion
G4004 (4) (Gm)              P26748            60                  35S        Direct promoter-fusion
G4005 (6) (Gm)              P26749            61                  35S        Direct promoter-fusion
G4000 (8) (Zm)              P27404            63                  35S        Direct promoter-fusion
G4012 (16) (Os)             P27406            64                  35S        Direct promoter-fusion
G4299 (22) (Sl)             P27428            65                  35S        Direct promoter-fusion

Species abbreviations for Table 4: At = Arabidopsis thaliana; Gm = Glycine max; Os = Oryza sativa; Sl = Solanum lycopersicum; Zm = Zea mays

Example II. Transformation

Transformation of Arabidopsis was performed by an Agrobacterium-mediated protocol based on the method of Bechtold and Pelletier (1998). Unless otherwise specified, all experimental work was done using the Columbia ecotype.

Plant Preparation. Arabidopsis seeds were sown on mesh covered pots. The seedlings were thinned so that 6-10 evenly spaced plants remained on each pot 10 days after planting. The primary bolts were cut off a week before transformation to break apical dominance and encourage axillary shoots to form. Transformation was typically performed at 4-5 weeks after sowing.

Bacterial culture preparation. Agrobacterium stocks were inoculated from single colony plates or from glycerol stocks and grown with the appropriate antibiotics until saturation. On the morning of transformation, the saturated cultures were centrifuged and bacterial pellets were re-suspended in Infiltration Media (0.5×MS, 1×B5 Vitamins, 5% sucrose, 1 mg/ml benzylaminopurine riboside, 200 μl/L Silwet L77) until an A600 reading of 0.8 was reached.

Transformation and seed harvest. The Agrobacterium solution was poured into dipping containers. All flower buds and rosette leaves of the plants were immersed in this solution for 30 seconds. The plants were laid on their side and wrapped to keep the humidity high. The plants were kept this way overnight at 4° C. and then the pots were turned upright, unwrapped, and moved to the growth racks.

The plants were maintained on the growth rack under 24-hour light until seeds were ready to be harvested. Seeds were harvested when 80% of the siliques of the transformed plants were ripe (approximately 5 weeks after the initial transformation). This seed was deemed T0 seed, since it was obtained from the T0 generation, and was later plated on selection plates (either kanamycin or sulfonamide). Resistant plants that were identified on such selection plates comprised the T1 generation, from which transgenic seed comprising an expression vector of interest could be derived.

Example III. Morphology Analysis

Morphological analysis was performed to determine whether changes in polypeptide levels affect plant growth and development. This was primarily carried out on the T1 generation, when at least 10-20 independent lines were examined. However, in cases where a phenotype required confirmation or detailed characterization, plants from subsequent generations were also analyzed.

Primary transformants were selected on MS medium with 0.3% sucrose and 50 mg/l kanamycin. T2 and later generation plants were selected in the same manner, except that kanamycin was used at 35 mg/l. In cases where lines carry a sulfonamide marker (as in all lines generated by super-transformation), seeds were selected on MS medium with 0.3% sucrose and 1.5 mg/l sulfonamide. KO lines were usually germinated on plates without a selection. Seeds were cold-treated (stratified) on plates for three days in the dark (in order to increase germination efficiency) prior to transfer to growth cabinets. Initially, plates were incubated at 22° C. under a light intensity of approximately 100 microEinsteins for 7 days. At this stage, transformants were green, possessed the first two true leaves, and were easily distinguished from bleached kanamycin or sulfonamide-susceptible seedlings. Resistant seedlings were then transferred onto soil (Sunshine potting mix). Following transfer to soil, trays of seedlings were covered with plastic lids for 2-3 days to maintain humidity while they became established. Plants were grown on soil under fluorescent light at an intensity of 70-95 microEinsteins and a temperature of 18-23° C. Light conditions consisted of a 24-hour photoperiod unless otherwise stated. In instances where alterations in flowering time were apparent, flowering time was re-examined under both 12-hour and 24-hour light to assess whether the phenotype was photoperiod dependent. Under our 24-hour light growth conditions, the typical generation time (seed to seed) was approximately 14 weeks.

Because many aspects of Arabidopsis development are dependent on localized environmental conditions, in all cases plants were evaluated in comparison to controls in the same flat. As noted below, controls for transgenic lines were wild-type plants or transgenic plants harboring an empty transformation vector selected on kanamycin or sulfonamide. Careful examination was made at the following stages: seedling (1 week), rosette (2-3 weeks), flowering (4-7 weeks), and late seed set (8-12 weeks). Seed was also inspected. Seedling morphology was assessed on selection plates. At all other stages, plants were macroscopically evaluated while growing on soil. All significant differences (including alterations in growth rate, size, leaf and flower morphology, coloration, and flowering time) were recorded, but routine measurements were not taken if no differences were apparent. In certain cases, stem sections were stained to reveal lignin distribution. In these instances, hand-sectioned stems were mounted in phloroglucinol saturated 2M HCl (which stains lignin pink) and viewed immediately under a dissection microscope.

Note that for a given project (gene-promoter combination, GAL4 fusion lines, RNAi lines etc.), ten lines were typically examined in subsequent plate based physiology assays.

Example IV. Physiology Experimental Methods

In subsequent Examples, unless otherwise indicated, morphological and physiological traits are disclosed in comparison to wild-type control plants. That is, a transformed plant that is described as large and/or drought tolerant was large and more tolerant to drought with respect to a control plant, the latter including wild-type plants, parental lines and lines transformed with a vector that does not contain a sequence of interest. When a plant is said to have a better performance than controls, it generally was larger, had greater yield, and/or showed fewer stress symptoms than control plants. The better performing lines may, for example, have produced less anthocyanin, or were larger, greener, or more vigorous in response to a particular stress, as noted below. Better performance generally implies greater size or yield, or tolerance to a particular biotic or abiotic stress, less sensitivity to ABA, or better recovery from a stress (as in the case of a soil-based drought treatment) than controls.

Plate Assays. Different plate-based physiological assays (shown below), representing a variety of abiotic and water-deprivation-stress related conditions, were used as a pre-screen to identify top performing lines (i.e. lines from transformation with a particular construct), that were generally then tested in subsequent soil based assays. Typically, ten lines were subjected to plate assays, from which the best three lines were selected for subsequent soil based assays. However, in projects where significant stress tolerance was not obtained in plate based assays, lines were not submitted for soil assays.

In addition, some projects were subjected to nutrient limitation studies. A nutrient limitation assay was intended to find genes that allowed more plant growth upon deprivation of nitrogen. Nitrogen is a major nutrient affecting plant growth and development that ultimately impacts yield and stress tolerance. These assays monitored primarily root but also rosette growth on nitrogen deficient media. In all higher plants, inorganic nitrogen is first assimilated into glutamate, glutamine, aspartate and asparagine, the four amino acids used to transport assimilated nitrogen from sources (e.g. leaves) to sinks (e.g. developing seeds). This process may be regulated by light, as well as by C/N metabolic status of the plant. A C/N sensing assay was thus used to look for alterations in the mechanisms plants use to sense internal levels of carbon and nitrogen metabolites which could activate signal transduction cascades that regulate the transcription of N-assimilatory genes. To determine whether these mechanisms are altered, we exploited the observation that wild-type plants grown on media containing high levels of sucrose (3%) without a nitrogen source accumulate high levels of anthocyanins. This sucrose induced anthocyanin accumulation can be relieved by the addition of either inorganic or organic nitrogen. We used glutamine as a nitrogen source since it also serves as a compound used to transport N in plants.

Germination assays. The following germination assays were conducted with Arabidopsis overexpressors of G1988 and closely-related sequences: NaCl (150 mM), mannitol (300 mM), sucrose (9.4%), ABA (0.3 μM), cold (8° C.), polyethylene glycol (10%, with Phytogel as gelling agent), or C/N sensing or low nitrogen medium. In the text below, -N refers to basal media minus nitrogen plus 3% sucrose and -N/+Gln is basal media minus nitrogen plus 3% sucrose and 1 mM glutamine.

All germination assays were performed in tissue culture. Growing the plants under controlled temperature and humidity on sterile medium produces uniform plant material that has not been exposed to additional stresses (such as water stress) which could cause variability in the results obtained. All assays were designed to detect plants that were more tolerant or less tolerant to the particular stress condition and were developed with reference to the following publications: Jang et al. (1997), Smeekens (1998), Liu and Zhu (1997), Saleki et al. (1993), Wu et al. (1996), Zhu et al. (1998), Alia et al. (1998), Xin and Browse (1998), Leon-Kloosterziel et al. (1996). Where possible, assay conditions were originally tested in a blind experiment with controls that had phenotypes related to the condition tested.

Prior to plating, seed for all experiments was surface sterilized in the following manner: (1) 5 minute incubation with mixing in 70% ethanol, (2) 20 minute incubation with mixing in 30% bleach, 0.01% triton-X 100, (3) 5× rinses with sterile water, (4) re-suspension in 0.1% sterile agarose and stratification at 4° C. for 3-4 days.

All germination assays followed modifications of the same basic protocol. Sterile seeds were sown on the conditional media that had a basal composition of 80% MS+Vitamins. Plates were incubated at 22° C. under 24-hour light (120-130 μE m⁻² s⁻¹) in a growth chamber. Evaluation of germination and seedling vigor was performed five days after planting.

Growth assay. The following growth assays were conducted with Arabidopsis overexpressors of G1988 and closely-related sequences: severe desiccation (a type of water deprivation assay), growth in cold conditions at 8° C., root development (visual assessment of lateral and primary roots, root hairs and overall growth), and phosphate limitation. For the nitrogen limitation assay, plants were grown in 80% Murashige and Skoog (MS) medium in which the nitrogen source was reduced to 20 mg/L of NH₄NO₃. Note that 80% MS normally has 1.32 g/L NH₄NO₃ and 1.52 g/L KNO₃. For phosphate limitation assays, seven day old seedlings were germinated on phosphate-free MS medium in which KH₂PO₄ was replaced by K₂SO₄.

Unless otherwise stated, all experiments were performed with the Arabidopsis thaliana ecotype Columbia (Col-0), soybean or maize plants. Assays were usually conducted on non-selected segregating T2 populations (in order to avoid the extra stress of selection). Control plants for assays on lines containing direct promoter-fusion constructs were Col-0 plants transformed with an empty transformation vector (pMEN65). Controls for 2-component lines (generated by supertransformation) were the background promoter-driver lines (i.e. promoter::LexA-GAL4TA lines), into which the supertransformations were initially performed.

Procedures

For chilling growth assays, seeds were germinated and grown for seven days on MS+Vitamins+1% sucrose at 22° C. and then transferred to chilling conditions at 8° C. and evaluated after another 10 days and 17 days.

For severe desiccation (plate-based water deprivation) assays, seedlings were grown for 14 days on MS+Vitamins+1% Sucrose at 22° C. Plates were opened in the sterile hood for 3 hr for hardening and then seedlings were removed from the media and left to dry for two hours in the hood. After this time the plants were transferred back to plates and incubated at 22° C. for recovery. The plants were then evaluated after five days.

For the polyethylene glycol (PEG) hyperosmotic stress tolerance screen, plant seeds were gas sterilized with chlorine gas for 2 hrs. The seeds were plated on plates containing 3% PEG, 1/2×MS salts, 1% phytagel, and 10 μg/ml glufosinate-ammonium (BASTA). Two replicate plates per seed line were planted. The plates were placed at 4° C. for 3 days to stratify seeds. The plates were held vertically for 11 additional days at temperatures of 22° C. (day) and 20° C. (night). The photoperiod was 16 hrs. with an average light intensity of about 120 μmol/m2/s. The racks holding the plates were rotated daily within the shelves of the growth chamber carts. At 11 days, root length measurements were made. At 14 days, seedling status was determined, root length was measured, growth stage was recorded, the visual color was assessed, pooled seedling fresh weight was measured, and a whole plate photograph was taken.

Wilt screen assay. Transgenic and wild-type soybean plants were grown in 5″ pots in growth chambers. After the seedlings reached the V1 stage (the V1 stage occurs when the plants have one trifoliolate, and the unifoliolate and first trifoliolate leaves are unrolled), water was withheld and the drought treatment thus started. A drought injury phenotype score was recorded, in increasing severity of effect, as 1 to 4, with 1 designating no obvious effect and 4 indicating a dead plant. Drought scoring was initiated as soon as one plant in one growth chamber had a drought score of 1.5. Scoring continued every day until at least 90% of the wild type plants had achieved scores of 3.5 or more. At the end of the experiment the scores for both transgenic and wild type soybean seedlings were statistically analyzed using Risk Score and Survival analysis methods (Glantz (2001); Hosmer and Lemeshow (1999)).
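The start and stop rules for daily drought scoring described above can be sketched as two simple checks. This is an illustrative sketch only: the function names are hypothetical, and it assumes that "had a drought score of 1.5" means a score of 1.5 or higher.

```python
def scoring_should_start(scores):
    """Scoring begins once any plant reaches a drought score of 1.5.

    Assumption: scores at or above 1.5 trigger the start.
    """
    return any(score >= 1.5 for score in scores)


def scoring_should_stop(wild_type_scores):
    """Scoring ends once at least 90% of wild-type plants score >= 3.5."""
    total = len(wild_type_scores)
    if total == 0:
        return False
    reached = sum(1 for score in wild_type_scores if score >= 3.5)
    # Integer arithmetic avoids floating-point issues at exactly 90%.
    return reached * 10 >= total * 9


if __name__ == "__main__":
    # Hypothetical daily scores on the 1 (no effect) to 4 (dead) scale.
    wild_type = [3.5, 4.0, 3.5, 3.5, 4.0, 3.5, 3.5, 4.0, 3.5, 2.0]
    print(scoring_should_start(wild_type), scoring_should_stop(wild_type))
```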

Water use efficiency (WUE). WUE was estimated by exploiting the observation that elements can exist in both stable and unstable (radioactive) forms. Most elements of biological interest (including C, H, O, N, and S) have two or more stable isotopes, with the lightest of these present in much greater abundance than the others. For example, ¹²C is more abundant than ¹³C in nature (¹²C=98.89%, ¹³C=1.11%, ¹⁴C=<10⁻¹⁰%). Because ¹³C is slightly larger than ¹²C, fractionation of CO₂ during photosynthesis occurs at two steps:

1. ¹²CO₂ diffuses through air and into the leaf more easily;

2. ¹²CO₂ is preferred by the enzyme in the first step of photosynthesis, ribulose bisphosphate carboxylase/oxygenase.

WUE has been shown to be negatively correlated with carbon isotope discrimination during photosynthesis in several C3 crop species. Carbon isotope discrimination has also been linked to drought tolerance and yield stability in drought-prone environments and has been successfully used to identify genotypes with better drought tolerance. ¹³C/¹²C content was measured after combustion of plant material and conversion to CO₂, and analysis by mass spectroscopy. With comparison to a known standard, ¹³C content was altered in such a way as to suggest that overexpression of G1988 or related sequences improves water use efficiency.
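For orientation, the comparison of a measured ¹³C/¹²C ratio with a standard is conventionally expressed in delta notation, and discrimination is derived from the delta values of source air and plant material. The sketch below uses these widely used conventions, which are an assumption here since the text does not give its formulas; the function names and example numbers are hypothetical, and the standard ratio is left as a parameter rather than fixed.

```python
def delta_13c(sample_ratio, standard_ratio):
    """Delta 13C in per mil: (R_sample / R_standard - 1) * 1000,
    where R is the 13C/12C ratio. The standard must be supplied."""
    return (sample_ratio / standard_ratio - 1.0) * 1000.0


def discrimination(delta_air, delta_plant):
    """Carbon isotope discrimination in per mil, using the conventional
    form (delta_air - delta_plant) / (1 + delta_plant / 1000)."""
    return (delta_air - delta_plant) / (1.0 + delta_plant / 1000.0)


if __name__ == "__main__":
    # Hypothetical delta values (per mil) for air and plant tissue.
    print(round(discrimination(-8.0, -28.0), 2))
```

Under these conventions a smaller discrimination value corresponds to relatively more ¹³C in the tissue, which is the direction the text associates with higher WUE.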

Another potential indicator of WUE was stomatal conductance, that is, the extent to which stomata were open.

Data Interpretation

At the time of evaluation, plants were given one of the following scores:

-   (++) Substantially enhanced performance compared to controls. The phenotype was very consistent and growth was significantly above the normal levels of variability observed for that assay.
-   (+) Enhanced performance compared to controls. The response was consistent but was only moderately above the normal levels of variability observed for that assay.
-   (wt) No detectable difference from wild-type controls.
-   (−) Impaired performance compared to controls. The response was consistent but was only moderately above the normal levels of variability observed for that assay.
-   (−−) Substantially impaired performance compared to controls. The phenotype was consistent and growth was significantly above the normal levels of variability observed for that assay.
-   (n/d) Experiment failed, data not obtained, or assay not performed.

Example V. Soil Drought (Clay Pot)

The soil drought assay (performed in clay pots) was based on that described by Haake et al. (2002).

Experimental Procedure.

Previously, we have performed clay-pot assays on segregating T2 populations, sown directly to soil. However, in the current procedure, seedlings were first germinated on selection plates containing either kanamycin or sulfonamide.

Seeds were sterilized by a 2 minute ethanol treatment followed by 20 minutes in 30% bleach/0.01% Tween and five washes in distilled water. Seeds were sown to MS agar in 0.1% agarose and stratified for three days at 4° C., before transfer to growth cabinets with a temperature of 22° C. After seven days of growth on selection plates, seedlings were transplanted to 3.5 inch diameter clay pots containing 80 g of a 50:50 mix of vermiculite:perlite topped with 80 g of ProMix. Typically, each pot contained 14 seedlings, and plants of the transgenic line being tested were in separate pots from the wild-type controls. Pots containing the transgenic line and control pots were interspersed in the growth room, maintained under 24-hour light conditions (18-23° C. and 90-100 μE m⁻² s⁻¹) and watered for a period of 14 days. Water was then withheld and pots were placed on absorbent paper for a period of 8-10 days to apply a drought treatment. After this period, a visual qualitative "drought score" from 0-6 was assigned to record the extent of visible drought stress symptoms. A score of "6" corresponded to no visible symptoms whereas a score of "0" corresponded to extreme wilting and the leaves having a "crispy" texture. At the end of the drought period, pots were re-watered and scored after 5-6 days; the number of surviving plants in each pot was counted, and the proportion of the total plants in the pot that survived was calculated.

Analysis of results. In a given experiment, we typically compared 6 or more pots of a transgenic line with 6 or more pots of the appropriate control. The mean drought score and mean proportion of plants surviving (survival rate) were calculated for both the transgenic line and the wild-type pots. In each case a p-value was calculated, which indicated the significance of the difference between the two mean values. The results for each transgenic line across each planting for a particular project were then presented in a results table.

Calculation of p-values. For the assays where control and experimental plants were in separate pots, survival was analyzed with a logistic regression to account for the fact that the random variable is a proportion between 0 and 1. The reported p-value was the significance of the experimental proportion contrasted to the control, based upon regressing the logit-transformed data.
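As an illustration of a logit-scale contrast, the sketch below uses the fact that a logistic regression with a single two-level predictor (transgenic versus control) has a coefficient equal to the log odds ratio, with a Wald standard error given by the reciprocal counts. It pools plants across pots and so ignores pot-to-pot variation, which is an assumption of this sketch; the exact model fitted in these experiments is not stated, and the function name is hypothetical.

```python
import math


def survival_contrast(alive_t, dead_t, alive_c, dead_c):
    """Logit-scale contrast of transgenic vs. control survival.

    Returns (log odds ratio, standard error, two-sided Wald p-value).
    Counts are pooled over pots (an assumption of this sketch).
    """
    counts = (alive_t, dead_t, alive_c, dead_c)
    if min(counts) <= 0:
        raise ValueError("all four counts must be positive for the logit")
    # Difference of logits = coefficient of the group indicator.
    log_or = math.log(alive_t / dead_t) - math.log(alive_c / dead_c)
    se = math.sqrt(sum(1.0 / n for n in counts))
    z = log_or / se
    # Two-sided p-value from the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return log_or, se, p
```

For example, `survival_contrast(46, 38, 34, 50)` compares hypothetical counts of 46 of 84 surviving transgenic plants against 34 of 84 surviving controls.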

Drought score, being an ordered factor with no real numeric meaning, was analyzed with a non-parametric test between the experimental and control groups. The p-value was calculated with a Mann-Whitney rank-sum test.
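A rank-sum comparison of drought scores can be sketched as follows. This is a plain-Python illustration using mid-ranks for tied scores and a normal approximation with tie correction (no continuity correction); the software actually used is not stated, and for small samples an exact test may give a somewhat different p-value.

```python
import math
from collections import Counter


def mann_whitney(x, y):
    """Return (U statistic for x, two-sided p-value), using mid-ranks for ties."""
    n1, n2 = len(x), len(y)
    n = n1 + n2
    pooled = sorted(list(x) + list(y))
    # Mid-rank of each distinct value: the average of the ranks it occupies.
    rank = {}
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j] == pooled[i]:
            j += 1
        rank[pooled[i]] = (i + 1 + j) / 2.0
        i = j
    u = sum(rank[v] for v in x) - n1 * (n1 + 1) / 2.0
    # Variance of U under the null hypothesis, corrected for ties.
    ties = sum(t ** 3 - t for t in Counter(pooled).values())
    var = n1 * n2 / 12.0 * ((n + 1) - ties / (n * (n - 1)))
    if var == 0:
        return u, 1.0  # every score identical: no evidence of a difference
    z = (u - n1 * n2 / 2.0) / math.sqrt(var)
    return u, math.erfc(abs(z) / math.sqrt(2.0))
```

With six pots per group, as in a typical experiment, `mann_whitney(transgenic_scores, control_scores)` takes the per-pot drought scores (0-6) as two lists.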

Example VI. Soil Drought Physiological and Biochemical Measurements

These experiments determined the physiological basis for the drought tolerance conferred by each lead and were typically performed under soil grown conditions. Usually, the experiment was performed under photoperiodic conditions of 10-hr or 12-hr light. Where possible, a given project (gene/promoter combination or protein variant) was represented by three independent lines. Plants were usually at late vegetative/early reproductive stage at the time measurements were taken. Typically we assayed three different states: a well-watered state, a mild-drought state and a moderately severe drought state. In each case, we made comparisons to wild-type plants with the same degree of physical stress symptoms (wilting). To achieve this, staggered samplings were often required. Typically, for a given line, ten individual plants were assayed for each state.

The following physiological parameters were routinely measured: relative water content, ABA content, proline content, and photosynthesis rate. In some cases, measurements of chlorophyll levels, starch levels, carotenoid levels, and chlorophyll fluorescence were also made.

Analysis of results. In a given experiment, for a particular parameter, we typically compared about 10 samples from a given transgenic line with about 10 samples of the appropriate wild-type control at each drought state. The mean values for each physiological parameter were calculated for both the transgenic line and the wild-type pots. In each case, a p-value (calculated via a simple t-test) was determined, which indicated the significance of the difference between the two mean values.

A typical procedure is described below; this corresponds to the method used for the drought time-course experiment which we performed on wild-type plants during our baseline studies at the outset of the drought program.
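The t-test comparison of two means can be sketched as below. The text does not say which variant was used; this sketch assumes the classical pooled-variance (Student) two-sample test and returns only the statistic and degrees of freedom, leaving the p-value to a t table or a statistics package. The function name is hypothetical.

```python
import math
from statistics import mean, variance


def pooled_t(a, b):
    """Student two-sample t statistic and degrees of freedom (pooled variance)."""
    n1, n2 = len(a), len(b)
    df = n1 + n2 - 2
    # Pooled estimate of the common variance of the two groups.
    sp2 = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / df
    t = (mean(a) - mean(b)) / math.sqrt(sp2 * (1.0 / n1 + 1.0 / n2))
    return t, df
```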

Procedure. Seeds were stratified for three days at 4° C. in 0.1% agarose and sown on Metromix 200 in 2.25 inch pots (square or round). Plants were maintained in individual pots within flats grown under short days (10 hours light, 14 hours dark). Seedlings were watered as needed to maintain healthy plant growth and development. At 7 to 8 weeks after planting, plants were used in drought experiments.

Plants matched for equivalent growth development (rosette size) were removed from plastic flats and placed on absorbent paper. Pots containing plants used as well-watered controls were placed within a weigh boat and the dish placed on the absorbent paper. The purpose of the weigh boat was to retain any water that might leak from well-watered pots and affect pots containing plants undergoing the drought stress treatment.

On each day of sampling, up to 18 plants subjected to drought conditions and 6 well-watered controls (from each transgenic line) were picked from a randomly generated pool (given that they passed quality control standards). Biochemical analysis for photosynthesis, ABA, and proline was performed on the next three youngest, most fully expanded leaves. Relative water content was analyzed using the remaining rosette tissue.

Measurement of Photosynthesis. Photosynthesis was measured using a LICOR LI-6400 (Li-Cor Biosciences, Lincoln, Nebr.). The LI-6400 used infrared gas analyzers to measure carbon dioxide to generate a photosynthesis measurement. It was based upon the difference of the CO₂ reference (the amount put into the chamber) and the CO₂ sample (the amount that leaves the chamber). Since photosynthesis is the process of converting CO₂ to carbohydrates, we expected to see a decrease in the amount of CO₂ sample. From this difference, a photosynthesis rate could be generated. In some cases, respiration may occur and an increase in CO₂ may be detected. To perform measurements, the LI-6400 was set up and calibrated as per LI-6400 standard directions. Photosynthesis was measured in the youngest, most fully expanded leaf at 300 and 1000 ppm CO₂ using a metal halide light source. This light source provided about 700 μE m⁻² s⁻¹.

Fluorescence was measured in dark and light adapted leaves using either a LI-6400 (LICOR) with a leaf chamber fluorometer attachment or an OS-1 (Opti-Sciences, Hudson, N.H.) as described in the manufacturer's literature. When the LI-6400 was used, all manipulations were performed under a dark shade cloth. Plants were dark adapted by placing in a box under this shade cloth until used. The OS-30 utilized small clips to create dark adapted leaves.

Measurement of Abscisic Acid and Proline. The purpose of this experiment was to measure ABA and proline in plant tissue. ABA is a plant hormone believed to be involved in stress responses and proline is an osmoprotectant.

Three of the youngest, most fully expanded mature leaves were harvested, frozen in liquid nitrogen, lyophilized, and a dry weight measurement taken. Plant tissue was then homogenized in methanol to which 500 ng of d6-ABA had been added to act as an internal standard. The homogenate was filtered to remove plant material and the filtrate evaporated to a small volume. To this crude extract, approximately 3 ml of 1% acetic acid was added and the extract was further evaporated to remove any remaining methanol. The volume of the remaining aqueous extract was measured and a small aliquot (usually 200 to 500 μl) removed for proline analysis (protocol described below). The remaining extract was then partitioned twice against ether, the ether removed by evaporation and the residue methylated using ethereal diazomethane. Following removal of any unreacted diazomethane, the residue was dissolved in 100 to 200 μl ethyl acetate and analyzed by gas chromatography-mass spectrometry. Analysis was performed using an HP 6890 GC coupled to an HP 5973 MSD using a DB-5 ms gas capillary column. Column pressure was 20 psi. Initially, the oven temperature was 150° C. Following injection, the oven was heated at 5° C./min to a final temperature of 250° C. ABA levels were estimated using an isotope dilution equation and normalized to tissue dry weight.
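The isotope dilution step can be illustrated in its simplest form, in which the endogenous ABA amount is the ratio of the ABA peak area to the d6-ABA peak area multiplied by the amount of standard added. The exact equation used is not given in the text; this sketch assumes equal detector response for the labeled and unlabeled compounds and no isotopic overlap, and the function and parameter names are hypothetical.

```python
def aba_per_dry_weight(area_aba, area_d6_aba, dry_weight_g, standard_ng=500.0):
    """Endogenous ABA in ng per g dry weight by simple isotope dilution.

    standard_ng is the amount of d6-ABA internal standard added (500 ng in
    the procedure above). Equal response factors are assumed.
    """
    if area_d6_aba <= 0 or dry_weight_g <= 0:
        raise ValueError("standard peak area and dry weight must be positive")
    aba_ng = area_aba / area_d6_aba * standard_ng
    return aba_ng / dry_weight_g
```

Dividing the result by 1000 converts it to μg per g dry weight.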

Free proline content was measured according to Bates (Bates et al., 1973). The crude aqueous extract obtained above was brought up to a final volume of 500 μl using distilled water. Subsequently, 500 μl of glacial acetic acid was added followed by 500 μl of Chinard's Ninhydrin. Chinard's Ninhydrin was prepared by dissolving 2.5 g ninhydrin (triketohydrindene hydrate) in 60 ml glacial acetic acid at 70° C. to which 40 ml of 6 M phosphoric acid was added.

The samples were then heated at 95° to 100° C. for one hour. After this incubation period, samples were cooled and 1.5 ml of toluene was added. The upper toluene phase was removed and absorbance measured at 515 nm. Amounts of proline were estimated using a standard curve generated using L-proline and normalized to tissue dry weight.
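Reading proline amounts off a standard curve can be sketched as a least-squares line through the L-proline standards, inverted for each sample. The linear response, the `aliquot_fraction` scaling back to the whole extract, and all names here are assumptions for illustration; the standards used in the experiments are not listed in the text.

```python
def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def proline_per_dry_weight(absorbance, slope, intercept,
                           dry_weight_g, aliquot_fraction=1.0):
    """Proline (same amount unit as the standards) per g dry weight.

    aliquot_fraction is the share of the aqueous extract that was assayed.
    """
    in_aliquot = (absorbance - intercept) / slope
    return in_aliquot / aliquot_fraction / dry_weight_g
```

Here `xs` would hold the known proline amounts of the standards (for example in μmole) and `ys` their absorbance at 515 nm.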

Measurement of Relative Water Content. Relative Water Content (RWC) indicated the amount of water that is stored within the plant tissue at any given time. It was obtained by taking the field weight of the rosette minus the dry weight of the plant material and dividing by the weight of the rosette saturated with water minus the dry weight of the plant material. The resulting RWC value could be compared from plant to plant, regardless of plant size.

$\text{Relative Water Content} = \frac{\text{Field Weight} - \text{Dry Weight}}{\text{Turgid Weight} - \text{Dry Weight}} \times 100$

After tissue had been removed for array and ABA/proline analysis, the rosette was cut from the roots using a small pair of scissors. The field weight was obtained by weighing the rosette. The rosette was then immersed in cold water and placed in an ice water bath in the dark. The purpose of this was to allow the plant tissue to take up water while preventing any metabolism which could alter the level of small molecules within the cell. The next day, the rosette was carefully removed, blotted dry with tissue paper, and weighed to obtain the turgid weight. Tissue was then frozen, lyophilized, and weighed to obtain the dry weight.
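The three weighings described above map directly onto the RWC formula. A minimal sketch follows, with a hypothetical function name and weights in any single consistent unit:

```python
def relative_water_content(field_weight, turgid_weight, dry_weight):
    """RWC (%) = (field - dry) / (turgid - dry) * 100."""
    capacity = turgid_weight - dry_weight
    if capacity <= 0:
        raise ValueError("turgid weight must exceed dry weight")
    return (field_weight - dry_weight) / capacity * 100.0
```

Because the dry weight is subtracted from both numerator and denominator, rosettes of different sizes give directly comparable values.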

Starch determination. Starch was estimated using a simple iodine based staining procedure. Young, fully expanded leaves were harvested either at the end or beginning of a 12 hour light period and placed in tubes containing 80% ethanol or 100% methanol. Leaves were decolorized by incubating tubes in a 70° to 80° C. water bath until chlorophyll had been removed from leaf tissue. Leaves were then immersed in water to displace any residual methanol which may be present in the tissue. Starch was then stained by incubating leaves in an iodine stain (2 g KI, 1 g I₂ in 100 ml water) for one min and then washing with copious amounts of water. Tissue containing large amounts of starch stained dark blue or black; tissues depleted in starch were colorless.

Chlorophyll/carotenoid determination. For some experiments, chlorophyll was estimated in methanolic extracts using the method of Porra et al. (1989). Carotenoids were estimated in the same extract at 450 nm using an A(1%) of 2500. We also measured chlorophyll using a Minolta SPAD-502 (Konica Minolta Sensing Americas, Inc., Ramsey, N.J.). When the SPAD-502 was used to measure chlorophyll, both carotenoid and chlorophyll content could also be determined via HPLC. Pigments were extracted from leaf tissue by homogenizing leaves in acetone:ethyl acetate (3:2). Water was added, the mixture centrifuged, and the upper phase removed for HPLC analysis. Samples were analyzed using a Zorbax (Agilent Technologies, Palo Alto, Calif.) C18 (non-endcapped) column (250×4.6) with a gradient of acetonitrile:water (85:15) to acetonitrile:methanol (85:15) in 12.5 minutes. After holding at these conditions for two minutes, solvent conditions were changed to methanol:ethyl acetate (68:32) in two minutes.

Carotenoids and chlorophylls were quantified using peak areas and response factors calculated using lutein and beta-carotene as standards.

Nuclear and cytoplasmically-enriched fractions. We developed a platform to prepare nuclear and cytoplasmic protein extracts in a 96-well format using tungsten carbide beads for cell disruption in a mild detergent and a sucrose cushion to separate cytoplasmic from nuclear fractions. We used histone antibodies to demonstrate that this method effectively separates cytoplasmic from nuclear-enriched fractions. An alternate method (spun only) used the same disruption procedure, but simply pelleted the nuclei to separate them from the cytoplasm without the added purification of a sucrose cushion.

Quantification of mRNA level. Three shoot and three root biological replicates were typically harvested for each line, as described above in the protein quantification methods section. RNA was prepared using a 96-well format protocol, and cDNA synthesized from each sample. These preparations were used as templates for RT-PCR experiments. We measured the levels of transcript for a gene of interest relative to 18S RNA transcript for each sample using an ABI 7900 Real-Time RT-PCR machine with SYBR® Green technology (Applied Biosystems, Foster City, Calif.).
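One common way to express a transcript level relative to a reference such as 18S RNA is the comparative threshold-cycle (Ct) calculation. The text does not state which calculation was used, so the sketch below is an assumed illustration only; it further assumes that both amplicons double every cycle (100% amplification efficiency), and the function names are hypothetical.

```python
def relative_expression(ct_gene, ct_18s):
    """Transcript level of the gene relative to 18S, as 2^-(Ct_gene - Ct_18S)."""
    return 2.0 ** -(ct_gene - ct_18s)


def fold_change(ct_gene_line, ct_18s_line, ct_gene_control, ct_18s_control):
    """Expression in a transgenic line divided by expression in the control."""
    return (relative_expression(ct_gene_line, ct_18s_line)
            / relative_expression(ct_gene_control, ct_18s_control))
```

A gene that crosses threshold two cycles earlier in the transgenic line than in the control, at equal 18S Ct values, corresponds to a four-fold higher relative level under these assumptions.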

Phenotypic Analysis: Flowering time. Plants were grown in soil. Flowering time was determined based on either or both of (i) the number of days after planting to the first visible flower bud, and (ii) the total number of leaves (rosette or rosette plus cauline) produced by the primary shoot meristem.

Phenotypic Analysis: Heat stress. In preliminary experiments described in this report, plants were germinated in a growth chamber at 30° C. with 24 hour light for 11 days. Plants were allowed to recover in 22° C. with 24 hour light for three days, and photographs were taken to record health after the treatment. In a second experiment, seedlings were grown at 22° C. for four days on selective media, and the plates transferred to 32° C. for one week. They were then allowed to recover at 22° C. for three days. Forty plants from two separate plates were harvested for each line, and both fresh weight and chlorophyll content measured.

Phenotypic Analysis: dark-induced senescence. In preliminary experiments described in this report, plants were grown on soil for 27-30 days in 12 h light at 22° C. They were moved to a dark chamber at 22° C., and visually evaluated for senescence after 10-13 days. In some cases we used Fv/Fm as a measure of chlorophyll (Pourtau et al., 2004) on the youngest most fully-expanded leaf on each plant. The Fv/Fm mean for the 12 plants from each line was normalized to the Fv/Fm mean for the 12 matched controls.

Microscopy. Light microscopy, electron microscopy, and confocal microscopy were performed.

Various Definitions/Abbreviations Used

-   RWC = Relative water content, (field wt. − dry wt.)/(turgid wt. − dry wt.) × 100
-   ABA = Abscisic acid, μg/gdw
-   Proline = Proline, μmole/gdw
-   Chl SPAD = Chlorophyll estimated by a Minolta SPAD-502, ratio of 650 nm to 940 nm
-   A 300 = net assimilation rate, μmole CO₂/m²/s at 300 ppm CO₂
-   A 1000 = net assimilation rate, μmole CO₂/m²/s at 1000 ppm CO₂
-   Total Chl = mg/gfw, estimated by HPLC
-   Carot = mg/gfw, estimated by HPLC
-   Fo = minimal fluorescence of a dark adapted leaf
-   Fm = maximal fluorescence of a dark adapted leaf
-   Fo′ = minimal fluorescence of a light adapted leaf
-   Fm′ = maximal fluorescence of a light adapted leaf
-   Fs = steady state fluorescence of a light adapted leaf
-   Psi lf = water potential (MPa) of a leaf
-   Psi p = turgor potential (MPa) of a leaf
-   Psi pi = osmotic potential (MPa) of a leaf
-   Fv/Fm = (Fm−Fo)/Fm; maximum quantum yield of PSII
-   Fv′/Fm′ = (Fm′−Fo′)/Fm′; efficiency of energy harvesting by open PSII reaction centers
-   PhiPS2 = (Fm′−Fs)/Fm′; actual quantum yield of PSII
-   ETR = PhiPS2 × light intensity absorbed × 0.5; we use 100 μE/m²/s for an average light intensity and 85% as the amount of light absorbed
-   qP = (Fm′−Fs)/(Fm′−Fo′); photochemical quenching (includes photosynthesis and photorespiration); proportion of open PSII
-   qN = (Fm−Fm′)/(Fm−Fo′); non-photochemical quenching (includes mechanisms like heat dissipation)
-   NPQ = (Fm−Fm′)/Fm′; non-photochemical quenching (includes mechanisms like heat dissipation)
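The fluorescence-derived quantities defined above can be computed from the five raw readings in a single pass. The sketch below follows those definitions, using the stated defaults of 100 μE/m²/s and 85% absorption for ETR; the function name and dictionary keys are illustrative choices.

```python
def fluorescence_parameters(fo, fm, fo_p, fm_p, fs, light=100.0, absorbed=0.85):
    """Derive PSII parameters from dark-adapted (fo, fm) and
    light-adapted (fo_p, fm_p, fs) fluorescence readings."""
    phi_ps2 = (fm_p - fs) / fm_p
    return {
        "Fv/Fm": (fm - fo) / fm,                  # maximum quantum yield of PSII
        "Fv'/Fm'": (fm_p - fo_p) / fm_p,          # efficiency of open PSII centers
        "PhiPS2": phi_ps2,                        # actual quantum yield of PSII
        "ETR": phi_ps2 * light * absorbed * 0.5,  # electron transport rate
        "qP": (fm_p - fs) / (fm_p - fo_p),        # photochemical quenching
        "qN": (fm - fm_p) / (fm - fo_p),          # non-photochemical quenching
        "NPQ": (fm - fm_p) / fm_p,                # non-photochemical quenching
    }
```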

Screening for Water Use Efficiency

An aspect of this invention provides transgenic plants with enhanced yield resulting from enhanced water use efficiency and/or water deprivation tolerance.

This example describes a high-throughput method for greenhouse selection of transgenic plants, compared to wild type plants (tested as inbreds or hybrids), for water use efficiency. This selection process imposed three drought/re-water cycles on the plants over a total period of 15 days after an initial stress free growth period of 11 days. Each cycle consisted of five days, with no water being applied for the first four days and a water quenching on the fifth day of the cycle. The primary phenotypes analyzed by the selection method were the changes in plant growth rate as determined by height and biomass during a vegetative drought treatment. The hydration status of the shoot tissues following the drought was also measured. The plant heights were measured at three time points. The first was taken just prior to the onset of drought when the plant was 11 days old, which was the shoot initial height (SIH). The plant height was also measured halfway through the drought/re-water regimen, on day 18 after planting, to give the shoot mid-drought height (SMH). Upon the completion of the final drought cycle on day 26 after planting, the shoot portion of the plant was harvested and measured for a final height, which was the shoot wilt height (SWH), and also measured for shoot wilted biomass (SWM). The shoot was placed in water at 40° C. in the dark. Three days later, the weight of the shoot was determined to provide the shoot turgid weight (STM). After drying in an oven for four days, the weights of the shoots were determined to provide shoot dry biomass (SDM). The shoot average height (SAH) was the mean plant height across the three height measurements. If desired, the procedure described above may be adjusted by about +/− one day for each step. To correct for slight differences between plants, a size corrected growth value was derived from SIH and SWH. This was the Relative Growth Rate (RGR). RGR was calculated for each shoot using the formula [RGR %=(SWH−SIH)/((SWH+SIH)/2)*100]. Relative water content (RWC) is a measurement of how much (%) of the plant was water at harvest. RWC was calculated for each shoot using the formula [RWC %=(SWM−SDM)/(STM−SDM)*100]. For example, fully watered corn plants of this stage of development have around 98% RWC.
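The derived values in this screen follow directly from the six measurements per shoot. A minimal sketch using the abbreviations defined above (the function name is hypothetical; heights share one unit and weights share another):

```python
def screen_metrics(sih, smh, swh, swm, stm, sdm):
    """Derived values for one shoot in the drought/re-water screen.

    sih, smh, swh: initial, mid-drought and wilt heights.
    swm, stm, sdm: wilted, turgid and dry shoot weights.
    """
    return {
        "SAH": (sih + smh + swh) / 3.0,                     # shoot average height
        "RGR": (swh - sih) / ((swh + sih) / 2.0) * 100.0,   # relative growth rate, %
        "RWC": (swm - sdm) / (stm - sdm) * 100.0,           # relative water content, %
    }
```

Dividing the height gain by the mean of the initial and final heights is what makes RGR size corrected, so that plants that started taller are not favored.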

Example VII. Morphological Observations with G1988 and Related Sequence Overexpressors in Arabidopsis

In our earlier studies, overexpression of G1988 in Arabidopsis produced a small number of lines that flowered early, and in several overexpressing lines seedlings grew faster than control seedlings. We also demonstrated that, when grown on phosphate-free media, all lines of Arabidopsis seedlings constitutively overexpressing G1988 under the regulatory control of the 35S promoter appeared larger and had more root growth than controls. 35S::G1988 plants with high levels of G1988 expression produced long hypocotyls, long petioles, and upright leaves, phenotypes that suggest a role for this gene in light signaling, which may be one of the factors responsible for conferring increased yield in crop plants. 35S::G1988 lines showed additional striking phenotypes when grown under long days (16 hr light) or continuous light; the plants were stunted and displayed premature chlorosis and delayed development. In addition, occasional water-soaking of leaves was noted.

For the present study, fifty-one new 35S::G1988 direct promoter fusion lines were generated. Nine of these lines showed a long hypocotyl phenotype in the T1 generation. Ten lines that had not shown long hypocotyls in the T1 were examined in the T2 generation, and six of these lines showed at least some plants with long hypocotyls and long petioles, suggesting that the penetrance of the phenotype may be influenced by gene dosage or environmental conditions. The majority of T1 lines examined exhibited upraised leaves. Effects on flowering were inconsistent; some T1 lines were again noted to flower early, but careful characterization of two 35S::G1988 lines with high G1988 expression levels revealed either no difference in flowering or a slight delay, depending on the day length in which the plants were grown.

Morphological Similarities Conferred by G1988 and Orthologs

35S::G4000 (maize SEQ ID NOs: 7 and 8), 35S::G4012 (rice SEQ ID NOs: 15 and 16), 35S::G4299 (tomato SEQ ID NOs: 21 and 22), 35S::G4004 (soy SEQ ID NOs: 3 and 4) and 35S::G4005 (soy SEQ ID NOs: 5 and 6) lines showed similar morphology to 35S::G1988 lines. A number of 35S::G4012, 35S::G4004 and 35S::G4005 T1 seedlings had extended petioles on cotyledons, and 35S::G4000, 35S::G4012, 35S::G4299, and 35S::G4004 seedlings also had longer hypocotyls than controls under continuous light. Adult 35S::G4004 and 35S::G4005 plants also appeared very similar to high-expressing 35S::G1988 plants when grown under continuous light. When constitutively overexpressed, all of these sequences produced plants that had upright leaves, similar to the continuous light grown 35S::G1988 plants. The observations of upheld leaves, long hypocotyls and long petioles suggest that G4004 and G4005 function similarly to G1988 in light signaling, which may be a factor that can contribute to improved yield in G1988 clade-overexpressing plants. A number of 35S::G4004 lines were late in their development relative to the controls.

Of the twenty transgenic lines examined, one of the 35S::G4005 lines was larger in size than controls at the seedling stage, another line was wild-type in size, and all other lines were smaller in size than controls at this stage.

Effect of Ectopic Expression of G1988 on Early Season Growth

Constitutive overexpression of G1988 in soybean plants resulted in consistent increases in early season growth relative to control plants. This effect was particularly evident when the seeds of the overexpressors and controls were planted in late as opposed to early spring. In particular, lines of G1988 overexpressors that were associated with high yield, such as lines 178, 189, 200, 209, 213 and 218 (see, for example, Table 12) generally exhibited greater early season growth than controls.

Effect of Ectopic Expression of G1988 on Stem Diameter in Soy Plants

When grown in controlled short day conditions (10 hours of light), lines of soybean plants overexpressing G1988 did not appear to show increased stem diameters relative to control plants to any significant extent. However, at long day lengths (20 hours of light), G1988 overexpressors generally produced significantly greater stem diameter than controls. Increased stem diameters of G1988 overexpressors were confirmed in soybean plants grown in field conditions. Increased stem diameter can positively impact biomass as well as contribute to increased resistance to lodging.

TABLE 5 Soybean stem diameters of various G1988 overexpressors and controls grown at short and long day lengths

| Line | Day length | Average stem diameter (mm) | Difference from controls, average stem diameter (mm) | P-value |
|---|---|---|---|---|
| 206** | Short day | 4.35 | −0.47 | 0.025 |
| 178 | Short day | 4.43 | −0.39 | 0.049 |
| 218 | Short day | 4.60 | −0.22 | 0.250 |
| A3244 (control) | Short day | 4.82 | — | |
| 209 | Short day | 4.89 | +0.07 | 0.338 |
| 213 | Short day | 4.89 | +0.07 | 0.268 |
| A3244 (control) | Long day | 15.75 | — | |
| 178 | Long day | 16.83 | +1.08 | 0.071 |
| 213 | Long day | 16.92 | +1.07* | 0.021 |
| 218 | Long day | 17.46 | +1.71* | 0.004 |
| 206** | Long day | 16.29 | +0.54 | 0.104 |
| 209 | Long day | 17.17 | +1.42* | 0.027 |

*line showed a greater stem diameter relative to controls (significant at p < 0.05)

**did not express G1988 to a significant level

Effect of Ectopic Expression of G1988 on Internode Length in Soy Plants

In short day experiments (10 hours of light per day), soybean internode length increased, relative to controls. This effect was generally noticeable for almost all of the plants' internodes, but was particularly conspicuous for internodes 8-12 which formed relatively late in the plants' development (FIG. 10). However, internode length was generally greater at virtually all stages of growth, including during early season growth as seen with the early internodes (for example, internodes 1-5) compared in FIG. 10.

Effect of Ectopic Expression of G1988 on Canopy Coverage

Constitutive overexpression of G1988 in soybean plants resulted in consistent increases in late season canopy coverage relative to control plants. Increased canopy coverage was positively associated with lines that produced increased yield. Line 217, which did not overexpress G1988 to the same extent as did the high-yielding lines (line 217 ectopically expressed about 60% of the level of G1988 as generally found in high-yielding lines), did not exhibit significantly greater canopy coverage relative to controls.

Example VIII. Plate-Based Experimental Results

This report provides experimental observations for transgenic seedlings overexpressing G1988-related polypeptides in plate-based assays, testing for tolerance to abiotic stresses including water deprivation, cold, and low nitrogen or altered C/N sensing.

G1988 (SEQ ID NO: 1 and 2; Arabidopsis thaliana): Constitutive 35S Promoter

Plate-Based Physiology Assay Results in Arabidopsis

In our earlier studies, we demonstrated that seedlings germinated on plates that contained limited nitrogen (supplemented with glutamine) appeared less stressed than controls.

35S::G1988 plants were found to have altered performance in an assay measuring response to altered carbon/nitrogen ratios (C/N sensing assay). Nine out of ten 35S::G4004 lines also showed a significantly different response compared to control seedlings in a C/N sensing assay, consistent with the phenotype observed for 35S::G1988 plants.

Ten 35S::G1988 Arabidopsis plant lines were examined in physiological assays. In addition to the C/N sensing phenotype observed in previous analyses, enhanced performance on low nitrogen in a root growth assay was also observed. Three out of ten lines also showed dehydration tolerance in a plate-based severe desiccation assay, a type of water deprivation assay. Tolerance to sucrose (hyperosmotic stress in 9.4% sucrose) in a germination assay was also observed in six lines. These latter results suggested that the overexpressors would be more tolerant to other forms of water deprivation, such as drought and other related stresses. This supposition was confirmed by the results of a soil-based drought assay as noted below.

TABLE 6 G1988 (SEQ ID NO: 1 and 2 from Arabidopsis thaliana col) - Constitutive 35S Direct Promoter Fusion

Assay columns: Sucrose germ.; ABA germ.; Cold germ.; Growth in cold; Severe desiccation; Low N germ.; Germ. in low N + gln; Low N root growth

| Line | Assay results |
|---|---|
| 321 | + + + |
| 322 | + + + + |
| 323 | + + + |
| 324 | + |
| 325 | + + |
| 326 | + + + |
| 327 | + + + |
| 328 | + + + + |
| 329 | + + + + |
| 330 | + + + + |

germ. = germination, gln = glutamine

(+) indicates positive assay result/more tolerant or phenotype observed, relative to controls

(empty cell) indicates plants overexpressing G1988 in the line in the first column were wild-type in their performance

In addition to the experimental results shown in Table 6, 35S::G1988 seedlings were also found to be more tolerant to growth on 3% polyethylene glycol in a PEG-based hyperosmotic stress tolerance screen than control seedlings. 35S::G1988 seedlings showed more extensive root growth than controls on 3% polyethylene glycol.

Although G1988, SEQ ID NO: 1 and 2, did not confer increased cold tolerance in Arabidopsis in this set of experiments, G1988 was able to confer greater tolerance to cold, relative to controls, in germinating soybean plants overexpressing the Arabidopsis G1988 protein.

G4004 (SEQ ID NO: 3 and 4 from Glycine max): Overexpressed with the Constitutive CaMV 35S Promoter

Based on the experiments conducted to date, 35S::G4004 overexpressors were more tolerant to low nitrogen conditions and demonstrated a C/N sensing phenotype. In addition, seven of the 35S::G4004 lines performed better than control seedlings in a germination assay under cold conditions, as evidenced by less anthocyanin accumulation occurring in the transgenic plants, suggesting that this gene may also have utility in conferring improved cold germination (Table 7). Seedlings on control germination plates were noted to have long hypocotyls for seven out of ten lines examined. Seedlings were also noted to be small and stunted on control growth plates; given that these assays were performed under continuous light, this phenotype was consistent with the stunting noted in morphological assays.

TABLE 7 G4004 (SEQ ID NO: 3 and 4 from Glycine max) - Constitutive 35S Direct Promoter Fusion

Assay columns: Sucrose germ.; Cold germ.; Severe desiccation; Low N germ.; Germ. in low N + gln

| Line | Assay results |
|---|---|
| 301 | − + + |
| 302 | + + + |
| 303 | + + + |
| 304 | + + + |
| 305 | + + + |
| 306 | + + + |
| 308 | + + + |
| 309 | + + |
| 310 | + |
| 311 | + + + |

germ. = germination

(+) indicates positive assay result/more tolerant or phenotype observed, relative to controls

(empty cell) indicates plants overexpressing G4004 in the line in the first column were wild-type in their performance

(−) indicates a more sensitive phenotype was observed relative to controls

Example IX. Drought Assay Results in Arabidopsis and Soybean

Water is a major limiting factor for crop yield. In water-limited environments, crop yield is a function of water use, water use efficiency (WUE; defined as aerial biomass yield/water use) and the harvest index (HI; the ratio of yield biomass to the total aerial biomass at harvest). WUE is a trait that has been proposed as a criterion for yield improvement under drought.

In a soil drought assay (a form of water deprivation assay that can be used to compare WUE), three well-characterized 35S::G1988 Arabidopsis lines were examined. Two of these lines, lines 10-6-3 and 12-2-2, had high levels of G1988 expression and exhibited long hypocotyls, upraised leaves, and elongated petioles. These lines each showed enhanced recovery from drought in one out of two assays performed. The third line, line 8-5-1, had lower levels of G1988 and did not exhibit the characteristic morphology of the other two lines. This line showed no improvement in survival, and, in fact, performed worse in one replicate of the assay (not shown in Table 8). Nonetheless, two individual lines were identified that did show significantly improved drought performance, and thus could be selected on that basis for further development and use as a product.

Soil Drought—Clay Pot-Based Physiology Summary.

TABLE 8 35S::G1988 drought assay results:

| PID | Line | Project Type | Mean drought score, line | Mean drought score, control | p-value for drought score difference | Mean survival for line | Mean survival for control | p-value for difference in survival |
|---|---|---|---|---|---|---|---|---|
| P2499 | 10-6-3 | DPF | 3.1 | 2.2 | 0.29 | 0.55 | 0.41 | 0.015* |
| P2499 | 10-6-3 | DPF | 1.9 | 2.4 | 0.28 | 0.39 | 0.37 | 0.81 |
| P2499 | 12-2-2 | DPF | 2.4 | 2.8 | 0.58 | 0.41 | 0.48 | 0.28 |
| P2499 | 12-2-2 | DPF | 2.8 | 2.1 | 0.17 | 0.49 | 0.36 | 0.022* |

DPF = direct promoter fusion project

Survival = proportion of plants in each pot that survived

Drought scale: 6 (highest score) = no stress symptoms, 0 (lowest score; most severe effect) = extreme stress symptoms

*line performed better than control (significant at p < 0.11)

In addition to Arabidopsis plants, soybean plants overexpressing G1988 also performed better than controls in a water use efficiency (WUE) screen. Tissue was harvested from dry locations and ¹³C/¹²C content was measured after combustion of plant material, conversion to CO₂, and analysis by mass spectrometry. With comparison to a known standard, ¹³C content was altered in such a way as to indicate that overexpression of G1988 improved water use efficiency.
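Isotope measurements of this kind are conventionally reported as a per-mil deviation of the sample's ¹³C/¹²C ratio from that of a reference standard. The text does not state which notation or standard was used, so the sketch below assumes the conventional delta notation; the function name and the ratio values are hypothetical and chosen only to show the arithmetic.

```python
def delta_13c_permil(r_sample, r_standard):
    """Per-mil deviation of a sample 13C/12C ratio from a standard's ratio.

    Assumes the conventional delta notation: (R_sample / R_standard - 1) * 1000.
    """
    if r_standard <= 0:
        raise ValueError("standard ratio must be positive")
    return (r_sample / r_standard - 1.0) * 1000.0


# Hypothetical ratios, not measured values from the field trial.
example = delta_13c_permil(0.0110, 0.0112)
print(round(example, 3))
```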

Stomatal conductance was also measured. In the first field trial, three independent transgenic lines were found to have statistically significantly lower conductance. Other 35S::G1988 soybean lines tested also had lower stomatal conductance, but the data obtained with these lines were not statistically significant. Significant differences in stomatal conductance were not observed in a subsequent field trial.

Taken together, the isotope discrimination and stomatal conductance analysis suggest that plants overexpressing G1988 have increased transpiration efficiency, which indicates enhanced water use efficiency by said plants.

A survival analysis of soybean plants overexpressing G1988 was performed using a wilt screen assay. When analyzed against wild-type control plants, some of the transgenic lines tested showed a significantly (p<0.1) high risk score and a prolonged time to reach wilting. Almost all of the eleven lines of overexpressors tested showed prolonged time to wilting, and the differences in time to wilting for three lines as compared to controls were statistically significant (Table 9, data presented in order of decreasing statistical significance). The only two lines that appeared to show more advanced wilting than controls (results not significant) did not express G1988 to a significant degree.

Taken together, these data clearly indicated that overexpression of G1988, SEQ ID NOs: 1 and 2, in soybean can significantly improve tolerance to water deficit conditions.

TABLE 9 Time to wilting of 35S::G1988 soy plants and controls

Line      Mean time to wilting,    Mean time to wilting,   Difference, time     p value
          overexpressors (days)    controls (days)         to wilting (days)
651*      8.867                    6.308                   2.559                0.0008
200*      7.933                    6.308                   1.625                0.0718
652*      8.615                    7.333                   1.282                0.0834
189       8.714                    8.200                   0.514                0.1491
213       5.800                    4.714                   1.086                0.1619
217***    6.067                    4.714                   1.353                0.2022
198**     6.938                    8.200                   −1.262               0.2174
206**     5.933                    6.308                   −0.375               0.3105
209       7.200                    6.308                   0.892                0.4200
178       8.000                    7.083                   0.917                0.6613
218       7.600                    7.083                   0.517                0.9039

*line showed a significant prolonged time to wilting relative to controls (significant at p < 0.10)
**did not express G1988 to a significant level
***expressed G1988 to a lower degree than high yielding transgenic lines
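The difference column and the significance calls in Table 9 can be re-derived from the two mean columns and the p values. The following is a minimal sketch that transcribes the table values; the variable and function names are hypothetical, and the 0.10 threshold is the one given in the table footnote.

```python
# (line, mean days to wilting for overexpressors, mean for controls,
#  reported difference, p value), transcribed from Table 9.
TABLE_9 = [
    ("651", 8.867, 6.308, 2.559, 0.0008),
    ("200", 7.933, 6.308, 1.625, 0.0718),
    ("652", 8.615, 7.333, 1.282, 0.0834),
    ("189", 8.714, 8.200, 0.514, 0.1491),
    ("213", 5.800, 4.714, 1.086, 0.1619),
    ("217", 6.067, 4.714, 1.353, 0.2022),
    ("198", 6.938, 8.200, -1.262, 0.2174),
    ("206", 5.933, 6.308, -0.375, 0.3105),
    ("209", 7.200, 6.308, 0.892, 0.4200),
    ("178", 8.000, 7.083, 0.917, 0.6613),
    ("218", 7.600, 7.083, 0.517, 0.9039),
]


def recompute_differences(rows):
    """Overexpressor mean minus control mean, rounded to three decimals."""
    return {line: round(ox - ctrl, 3) for line, ox, ctrl, _, _ in rows}


def significant_lines(rows, alpha=0.10):
    """Lines whose p value falls below the footnote threshold."""
    return [line for line, _, _, _, p in rows if p < alpha]


diffs = recompute_differences(TABLE_9)
mismatches = [line for line, _, _, reported, _ in TABLE_9
              if abs(diffs[line] - reported) > 0.0005]
delayed = [line for line, d in diffs.items() if d > 0]
print(mismatches, significant_lines(TABLE_9), len(delayed))
```

The count of lines with a positive difference corresponds to the statement that almost all of the eleven lines showed prolonged time to wilting.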

Example X. Results for Cold Tolerance in Soybean

FIG. 7 displays experimental data obtained with a wild-type control line and numerous 35S::G1988 overexpressing lines showing that G1988 overexpression results in improved cold germination. The overall germination of the control seed from this field trial conducted in Winters, Calif., represented by the dotted line in FIG. 7, was poor and it was noted that a high percentage of the seed were “hard seed”, a stress-induced phenomenon that results in seeds that resist imbibition under standard conditions. A significantly greater percentage of G1988 overexpressing seed germinated at various time points in this field trial and with seed obtained in trials conducted in Illinois and Kansas. These data indicate a role for G1988 in overcoming stress responses and enhancing cell growth.

G4004 (SEQ ID NO: 4), a soy homolog of G1988 that is phylogenetically related to G1988 (FIG. 3 and FIGS. 4A-4F), was transformed into corn plants. The germination index of the corn plants overexpressing G4004 was then determined. The germination index is a function of percentage germination and rate of germination, and can be defined by the formula:

Germination index = [(T − T_1 + 1) × P_1 + (T − T_2 + 1) × (P_2 − P_1) + (T − T_3 + 1) × (P_3 − P_2) + . . . + (T − T_T + 1) × (P_T − P_(T−1))]/T

where T is the number of days for which germination was tested.

P_1, P_2, P_3, . . . and P_T are the percentages of seeds germinated on days T_1, T_2, T_3, . . . and T_T.
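The formula weights each day's newly germinated fraction by the number of test days remaining, so early germination scores higher than late germination. A minimal sketch of the calculation follows; the function name, argument names, and example numbers are hypothetical, and the percentages are assumed to be cumulative, as the differences between successive P values in the formula imply.

```python
def germination_index(days, cumulative_pct, total_days):
    """Germination index as defined by the formula in the text.

    days           -- observation days T_1..T_T, in increasing order
    cumulative_pct -- cumulative percent germinated P_1..P_T on those days
    total_days     -- T, the number of days for which germination was tested
    """
    if len(days) != len(cumulative_pct):
        raise ValueError("days and cumulative_pct must have the same length")
    if total_days <= 0:
        raise ValueError("total_days must be positive")
    weighted_sum = 0.0
    previous_pct = 0.0  # nothing has germinated before the first observation
    for day, pct in zip(days, cumulative_pct):
        newly_germinated = pct - previous_pct
        weighted_sum += (total_days - day + 1) * newly_germinated
        previous_pct = pct
    return weighted_sum / total_days


# Hypothetical three-day test: the same final germination, reached early or late.
fast = germination_index([1, 2, 3], [50, 80, 100], 3)
slow = germination_index([1, 2, 3], [0, 0, 100], 3)
print(round(fast, 2), round(slow, 2))
```

Both hypothetical seed lots reach 100% germination, but the lot that germinates earlier receives the higher index.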

As shown in Table 10, germination of some of the G4004-overexpressing corn lines demonstrated the greater tolerance to cold of the overexpressors as compared to control plants.

TABLE 10 Phenotypic data from cold germination experiments of corn plants overexpressing G4004

          Germination index
          Trial 1                  Trial 2
Line      % change    p value      % change    p value
609       −14         0.145        −20         0.073
610       −1          0.889        −8          0.465
612       14          0.131        13          0.242
616       25*         0.010*       41*         0.000*
619       7           0.436        38          0.001
710       28*         0.004*       45          0.000*
711       30*         0.002*       33          0.003*
117       −35         0.000**      −30         0.008**

The data are presented as the percentage change over wild type controls.
*Germination index significantly greater than controls (p < 0.05)
**Germination index significantly less than controls (p < 0.05)

The present invention thus demonstrates that transformation of plants, including monocots, with a member of the G1988 clade of polypeptides can confer to the transformed plants greater tolerance to cold conditions than the level of cold tolerance exhibited by control plants.

Example XI. Field Trial Results for Nitrogen Use Efficiency in Corn

A number of corn plants overexpressing the soybean G4004 polypeptide sequence (SEQ ID NO: 4) were more efficient in their use of nitrogen than control plants, as measured by increased chlorophyll and fresh shoot mass when grown in a greenhouse in low nitrogen media containing 2.0 mM ammonium nitrate as the nitrogen source (Table 11).

TABLE 11 Phenotypic data from low nitrogen screen of corn plants overexpressing G4004

          Leaf chlorophyll         Shoot fresh mass
Line      Trial 1     Trial 2      Trial 1     Trial 2
609       4.4         2.9          0.2         −2.8
610       6.1*        5.8*         0.5         1.1
612       0.9         3.1          −9.4**      −6.2**
616       10.8*       3.6          2.0         −1.2
619       9.5*        6.0*         15.8*       4.9*
710       1.6         5.6*         3.1         −0.3
711       6.8*        12.4*        7.9*        5.0
117       7.0*        12.6*        9.9*        3.5

The data are presented as the percentage change over wild type controls.
*Value significantly greater than controls at p < 0.05
**Value significantly less than controls at p < 0.10

The present invention thus demonstrates that transformation of plants, including monocots, with a member of the G1988 clade of polypeptides can confer greater tolerance to low nitrogen conditions and increased nitrogen use efficiency to the transgenic plants, relative to the tolerance to low nitrogen conditions and nitrogen use efficiency of control plants.

Example XII. Improved Yield in Soybean Field Trials

Arabidopsis thaliana sequence G1988 (SEQ ID NOs: 1 and 2), a putative transcription factor, was shown to increase yield potential in Glycine max (soybean). In consecutive years of broad acre yield trials, transgenic plants constitutively expressing G1988 outperformed control cultivars, with a construct average of greater than 6% yield increase. Field observations of G1988 transgenic soybean identified several yield-related traits that were modulated by the transgene, including increased height, improved early season vigor and increased estimated stand count. G1988-overexpressing soy plants were slightly early flowering (less than one day as a construct average), slightly delayed in maturity (approximately one day as a construct average), and produced additional mainstem pod-containing nodes late in the season (FIG. 9).

Table 12 shows results obtained with nine 35S::G1988 soybean lines tested for broad acre yield in 2004 at ten locations in the U.S., with two replicates per location. Each replicate was planted at a density of nine seeds per foot in two twelve-foot rows divided by a three-foot alley. Yield was recorded as bushels per acre and compared by spatial analysis to a non-transformed parental control line. The G1988 overexpressors showed increased yield in six of seven lines that showed significant expression of the transgene (Table 12). In addition to increased yield, several of the lines showed early flowering, delayed maturity, and early stand count.

TABLE 12 Yield of 35S::G1988 overexpressing soy plants relative to control plants in a 2004 field trial

Line      Yield             p value     mRNA expression
          (bushels/acre)                (normalized average)
206**     −5.86             0.000       19044
198**     −2.88             0.043       63330
217***    −2.69             0.047       1412864
200*      0.35              0.798       1972981
178*      2.4               0.077       2155338
189*      2.63              0.052       2197454
213*      3.21              0.018       2088695
209*      3.63              0.007       2175037
218*      4.13              0.002       2158073

*showed significant increase in yield over controls
**did not express G1988 to a significant level
***expressed G1988 to a lower degree than high yielding transgenic lines

Various lines of transgenic soybean plants overexpressing G1988 (35S::G1988) were also grown in field trials in 2005. In both 2004 and 2005, on average, G1988 overexpressing soybean plants were somewhat taller than control plants. When yield data were averaged across multiple locations, a consistent increase in yield in bushels per acre, as compared with the parental line, was observed for both years (FIG. 6). In the 2005 field trial, G1988 overexpression significantly increased yield in 17 of 19 locations tested. If the line shown as line 4 in FIG. 6, which unlike the other lines presented in the FIG. 6 graph showed little or no expression of G1988 in leaf tissue, was removed from the statistical analysis, the average yield increase in 2005 was about 6.7%.

Analysis of soybean yield across three years of field trials showed that G1988, when overexpressed in numerous transgenic lines, was able to confer increased yield relative to controls (Table 13).

TABLE 13 Across year analysis of soybean yield of transgenic lines overexpressing G1988

Line      Plant Yield       Difference relative    % Difference    P value
          (bushels/acre)    to control
                            (bushels/acre)
178*      63.8              +3.9                   6.5             0.000
189*      63.6              +3.7                   6.1             0.000
209*      63.0              +3.1                   5.2             0.001
218*      62.8              +2.9                   4.9             0.001
213*      62.6              +2.7                   4.5             0.001
200*      62.2              +2.3                   3.9             0.007
217***    59.8              −0.2                   −0.3            0.827
206**     58.1              −1.8                   −3.1            0.031

*showed significant increase in yield over controls
**did not express G1988 to a significant level
***expressed G1988 to a lower degree than high yielding transgenic lines
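The percentage column in Table 13 can be approximated from the other two yield columns. The sketch below assumes that the control yield equals the line yield minus the reported difference, which the table does not state explicitly; the function name is hypothetical. Because the table entries are themselves rounded, recomputed percentages may differ from the printed ones by about 0.1.

```python
def percent_difference(line_yield, difference_vs_control):
    """Yield difference as a percentage of the implied control yield.

    Assumes control yield = line yield - reported difference (bushels/acre).
    """
    control_yield = line_yield - difference_vs_control
    if control_yield <= 0:
        raise ValueError("implied control yield must be positive")
    return 100.0 * difference_vs_control / control_yield


# Line 178 from Table 13: 63.8 bushels/acre, +3.9 relative to control.
print(round(percent_difference(63.8, 3.9), 1))
# Line 217 from Table 13: 59.8 bushels/acre, -0.2 relative to control.
print(round(percent_difference(59.8, -0.2), 1))
```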

Table 14 demonstrates yet another means by which G1988 overexpression may increase yield in soy plants. In this table, the final stand counts of transgenic and control plants from both early and late planting dates were compared. High yielding lines demonstrated a significantly greater final stand count than the control line tested under the same conditions. In numerous instances, these results were significant at p<0.05.

TABLE 14 Across year analysis of soybean yield of transgenic lines overexpressing G1988

Planting   Line      Final Stand    Emergence    Difference from    P value
time                 (plants per    (%)          control plants
                     plot)                       (# plants)
Early      178       151            70           16                 0.05*
Early      189       147            68           11                 0.15
Early      200       146            67           7                  0.40
Early      206**     141            65           4                  0.65
Early      209       139            64           2                  0.84
Early      213       150            69           16                 0.05*
Early      217***    142            66           6                  0.45
Early      218       157            73           28                 0.001*
Early      Control   134            62           0
Late       178       168            78           19                 0.009*
Late       189       161            74           14                 0.04*
Late       200       162            75           17                 0.01*
Late       206**     152            71           5                  0.42
Late       209       157            73           12                 0.08*
Late       213       164            76           18                 0.01*
Late       217***    153            71           4                  0.56
Late       218       162            75           19                 0.008*
Late       Control   153            71           0

*significant at p < 0.05
**did not express G1988 to a significant level
***expressed G1988 to a lower degree than high yielding transgenic lines

FIG. 11 shows the results of a plant density field trial. The soybean plants represented in this figure that overexpressed G1988 demonstrated an observable yield increase across a wide range of plant densities, relative to control plants that either did not overexpress G1988 (shown as the unfilled circles), or control transgenic plants that did not express G1988 to a significant degree (shown as the filled circles).

Five lines of overexpressors are represented by the unfilled triangles, filled triangles, unfilled squares, filled squares, and asterisks. As shown in this figure, each of the five lines expressing G1988 to a significant degree provided a greater yield than the controls at all densities tested, and thus the plant stand count did not contribute greatly to harvestable yield.

One possible explanation for the increase in soy yield is an increase in pod-containing mainstem nodes relative to control plants that do not overexpress the G1988 polypeptide. As shown in FIG. 9, when various lines of soybean plants overexpressing a number of sequences were compared, a considerable range of the mean number of pod-containing mainstem nodes relative to the parental control line was observed (the observed difference for the control line was “0”, and hence is represented in FIG. 9 by the “0” ordinate line). The shaded bars denote G1988 overexpressing lines, all of which produced more nodes than the control, with four of the five lines producing the highest positive difference in nodes observed.

The present invention thus demonstrates that transgenic plants, including legumes, and particularly including soybeans, transformed with a member of the G1988 clade of polypeptides can show increased yield relative to the yield exhibited by control plants.

Example XIII. Utilities of G1988 and its Phylogenetically-Related Sequences for Improving Yield

Increased Abiotic Stress Tolerance May Improve Yield.

G1988 also improved stress tolerance in Arabidopsis, and early experiments have shown that closely-related homologs of G1988 also confer improved tolerance, relative to controls, to abiotic stress conditions such as cold or low nitrogen. Improved abiotic stress tolerance may have a significant impact on yield, including during periods of mild, moderate, and considerable stress.

Increased Stem Diameter May Improve Yield.

Increased stem diameter can positively impact biomass of a plant, and also provide increased resistance to lodging.

More Secondary Rooting May Improve Yield.

Providing greater secondary rooting by transforming plants with G1988 clade member sequences can confer better anchorage relative to control plants. Transformed plants may also be produced that have the capacity to thrive in otherwise unproductive soils, such as in low nutrient environments, or in regions or periods of low water availability. Osmotic stress tolerance may also be mediated by increased root growth. These factors increase the effective planting range of the crop and/or increase survival and yield.

Increasing Numbers of Mainstem Nodes May Improve Yield.

The number of mainstem nodes of a variety of crops is related to the yield produced by the plant. For example, soybean and other seed-bearing crops produce seed-bearing pods from their mainstem nodes, and thus, increasing the number of mainstem nodes has a positive impact on seed number produced by the plant. Greater mainstem node number can also increase biomass or the yield of other crops such as cotton, where boll set is related to mainstem node number.

Reduced Light Sensitivity May Improve Yield.

Light exerts its influence on many aspects of plant growth and development, including germination, greening, and flowering time. Light triggers inhibition of hypocotyl elongation along with greening in young seedlings. Thus, differences in hypocotyl length are a good measure of responsiveness to light. Seedlings overexpressing G1988 exhibited elongated hypocotyls in light due to reduced inhibition of hypocotyl elongation. The G1988 overexpressors were also hyposensitive to blue, red and far-red wavelengths, indicating that G1988 acts downstream of the photoreceptors responsible for perceiving the different colors of light. This finding indicated that adult plants overexpressing G1988 had reduced sensitivity to the incident light.

Closely-related homologs of G1988 from corn (G4000, SEQ ID NO: 8), soybean (G4004, SEQ ID NO: 4), rice (G4012), and tomato (G4299, SEQ ID NO: 22) also conferred long hypocotyls when overexpressed in Arabidopsis. In experiments conducted thus far, overexpression of the soybean-derived homolog G4005 (SEQ ID NO: 6) did not cause long hypocotyls in the lines produced, but G4005 did confer other indications of an altered light response such as upright petioles and leaves. Thus, there is a strong correlation between G1988 and its orthologs from corn, soybean, rice and tomato in their ability to reduce light sensitivity, and these data indicate that G1988 and its closely related homologs function similarly in signaling pathways involved in light sensitivity. It is thus predicted that, like G1988, closely-related G1988 clade member homologs may also improve traits that can be affected by reduced light sensitivity. Reduced light sensitivity may contribute to improvements in yield relative to control plants.

Greater Early Season Growth May Improve Yield.

For almost all commercial crops, it is desirable to use plants that establish quickly, since seedlings and young plants are particularly susceptible to stress conditions such as salinity or disease. Since many weeds may outgrow young crops or outcompete them for nutrients, it would also be desirable to determine means for allowing young crop plants to outcompete weed species. Increasing seedling and young plant vigor allows for crops to be planted earlier in the season with less concern for losses due to environmental factors.

Greater Late Season Vigor May Improve Yield.

Constitutive expression of G1988 significantly improved late season growth and vigor in soybeans. G1988 overexpressors had an increase in pod-containing mainstem nodes, greater plant height, and consistent increases in late season canopy coverage. These differences relative to control or untransformed plants may have had a significant positive impact on yield.

Because of the observed morphological, physiological and stress tolerance similarities between G1988 and its closely-related homologs, the polypeptide members of the G1988 clade, including the sequences presented in Table 1 and the Sequence Listing, are expected to increase yield, crop quality, and/or growth range, and decrease fertilizer and/or water usage in a variety of crop plants, ornamental plants, and woody plants used in the food, ornamental, paper, pulp, lumber or other industries.

Example XIV. Transformation of Eudicots to Produce Increased Yield and/or Abiotic Stress Tolerance

Crop species that overexpress polypeptides of the invention may produce plants with increased tolerance to water deprivation, cold and/or nutrient limitation, and/or increased yield in both stressed and non-stressed conditions. Thus, polynucleotide sequences listed in the Sequence Listing recombined into, for example, one of the expression vectors of the invention, or another suitable expression vector, may be transformed into a plant for the purpose of modifying plant traits to improve yield and/or quality. The expression vector may contain a constitutive, tissue-specific or inducible promoter operably linked to the polynucleotide. The cloning vector may be introduced into a variety of plants by means well known in the art such as, for example, direct DNA transfer or Agrobacterium tumefaciens-mediated transformation. It is now routine to produce transgenic plants using most eudicot plants (see Weissbach and Weissbach (1989); Gelvin et al. (1990); Herrera-Estrella et al. (1983); Bevan (1984); and Klee (1985)). Methods for analysis of traits are routine in the art and examples are disclosed above.

Numerous protocols for the transformation of tomato and soy plants have been previously described, and are well known in the art. Gruber et al. (1993), in Glick and Thompson (1993), describe several expression vectors and culture methods that may be used for cell or tissue transformation and subsequent regeneration. For soybean transformation, methods are described by Miki et al. (1993); and U.S. Pat. No. 5,563,055 (Townsend and Thomas), issued Oct. 8, 1996.

There are a substantial number of alternatives to Agrobacterium-mediated transformation protocols for the purpose of transferring exogenous genes into soybeans or tomatoes. One such method is microprojectile-mediated transformation, in which DNA on the surface of microprojectile particles is driven into plant tissues with a biolistic device (see, for example, Sanford et al. (1987); Christou et al. (1992); Sanford (1993); Klein et al. (1987); U.S. Pat. No. 5,015,580 (Christou et al.), issued May 14, 1991; and U.S. Pat. No. 5,322,783 (Tomes et al.), issued Jun. 21, 1994).

Alternatively, sonication methods (see, for example, Zhang et al. (1991)); direct uptake of DNA into protoplasts using CaCl₂ precipitation, polyvinyl alcohol or poly-L-ornithine (see, for example, Hain et al. (1985); Draper et al. (1982)); liposome or spheroplast fusion (see, for example, Deshayes et al. (1985); Christou et al. (1987)); and electroporation of protoplasts and whole cells and tissues (see, for example, Donn et al. (1990); D'Halluin et al. (1992); and Spencer et al. (1994)) have been used to introduce foreign DNA and expression vectors into plants.

After a plant or plant cell is transformed (and the transformed host plant cell then regenerated into a plant), the transformed plant may be crossed with itself or a plant from the same line, a non-transformed or wild-type plant, or another transformed plant from a different transgenic line of plants. Crossing provides the advantages of producing new and often stable transgenic varieties. Genes and the traits they confer that have been introduced into a tomato or soybean line may be moved into a distinct line of plants using traditional backcrossing techniques well known in the art.

Transformation of tomato plants may be conducted using the protocols of Koornneef et al. (1986), and in U.S. Pat. No. 6,613,962, the latter method described in brief here. Eight-day-old cotyledon explants are precultured for 24 hours in Petri dishes containing a feeder layer of Petunia hybrida suspension cells plated on MS medium with 2% (w/v) sucrose and 0.8% agar supplemented with 10 μM α-naphthalene acetic acid and 4.4 μM 6-benzylaminopurine. The explants are then infected for 5-10 minutes with a diluted overnight culture of Agrobacterium tumefaciens containing an expression vector comprising a polynucleotide of the invention, blotted dry on sterile filter paper and cocultured for 48 hours on the original feeder layer plates. Culture conditions are as described above. Overnight cultures of Agrobacterium tumefaciens are diluted in liquid MS medium with 2% (w/v) sucrose (pH 5.7) to an OD₆₀₀ of 0.8.

Following cocultivation, the cotyledon explants are transferred to Petri dishes with selective medium comprising MS medium with 4.56 μM zeatin, 67.3 μM vancomycin, 418.9 μM cefotaxime and 171.6 μM kanamycin sulfate, and cultured under the culture conditions described above. The explants are subcultured every three weeks onto fresh medium. Emerging shoots are dissected from the underlying callus and transferred to glass jars with selective medium without zeatin to form roots. The formation of roots in a kanamycin sulfate-containing medium is a positive indication of a successful transformation.

Transformation of soybean plants may be conducted using the methods found in, for example, U.S. Pat. No. 5,563,055 (Townsend et al., issued Oct. 8, 1996), described in brief here. In this method soybean seed is surface sterilized by exposure to chlorine gas evolved in a glass bell jar. Seeds are germinated by plating on 1/10 strength agar solidified medium without plant growth regulators and culturing at 28° C. with a 16 hour day length. After three or four days, seed may be prepared for cocultivation. The seedcoat is removed and the elongating radicle removed 3-4 mm below the cotyledons.

Overnight cultures of Agrobacterium tumefaciens harboring the expression vector comprising a polynucleotide of the invention are grown to log phase, pooled, and concentrated by centrifugation.

Inoculations are conducted in batches such that each plate of seed is treated with a newly resuspended pellet of Agrobacterium. The pellets are resuspended in 20 ml inoculation medium. The inoculum is poured into a Petri dish containing prepared seed and the cotyledonary nodes are macerated with a surgical blade. After 30 minutes the explants are transferred to plates of the same medium that has been solidified. Explants are embedded with the adaxial side up and level with the surface of the medium and cultured at 22° C. for three days under white fluorescent light. These plants may then be regenerated according to methods well established in the art, such as by moving the explants after three days to a liquid counter-selection medium (see U.S. Pat. No. 5,563,055).

The explants may then be picked, embedded and cultured in solidified selection medium. After one month on selective media, transformed tissue becomes visible as green sectors of regenerating tissue against a background of bleached, less healthy tissue. Explants with green sectors are transferred to an elongation medium. Culture is continued on this medium with transfers to fresh plates every two weeks. When shoots are 0.5 cm in length they may be excised at the base and placed in a rooting medium.

Example XV: Transformation of Monocots to Produce Increased Yield or Abiotic Stress Tolerance

Cereal plants such as, but not limited to, corn, wheat, rice, sorghum, or barley, may be transformed with the present polynucleotide sequences, including monocot or eudicot-derived sequences such as those presented in the present Tables, cloned into a vector such as pGA643 and containing a kanamycin-resistance marker, and expressed constitutively under, for example, the CaMV 35S or COR15 promoters, or with tissue-specific or inducible promoters. The expression vector may be one found in the Sequence Listing, or any other suitable expression vector may be similarly used. For example, pMEN020 may be modified to replace the NptII coding region with the BAR gene of Streptomyces hygroscopicus that confers resistance to phosphinothricin. The KpnI and BglII sites of the Bar gene are removed by site-directed mutagenesis with silent codon changes.

The cloning vector may be introduced into a variety of cereal plants by means well known in the art including direct DNA transfer or Agrobacterium tumefaciens-mediated transformation. The latter approach may be accomplished by a variety of means, including, for example, that of U.S. Pat. No. 5,591,616, in which monocotyledon callus is transformed by contacting dedifferentiating tissue with the Agrobacterium containing the cloning vector.

The sample tissues are immersed in a suspension of 3×10⁹ cells of Agrobacterium containing the cloning vector for 3-10 minutes. The callus material is cultured on solid medium at 25° C. in the dark for several days. The calli grown on this medium are transferred to Regeneration medium. Transfers are continued every 2-3 weeks (2 or 3 times) until shoots develop. Shoots are then transferred to Shoot-Elongation medium every 2-3 weeks. Healthy looking shoots are transferred to rooting medium and, after roots have developed, the plants are placed into moist potting soil.

The transformed plants are then analyzed for the presence of the NPTII gene/kanamycin resistance by ELISA, using the ELISA NPTII kit from 5Prime-3Prime Inc. (Boulder, Colo.).

It is also routine to use other methods to produce transgenic plants of most cereal crops (Vasil (1994)) such as corn, wheat, rice, sorghum (Cassas et al. (1993)), and barley (Wan and Lemeaux (1994)). DNA transfer methods such as the microprojectile method can be used for corn (Fromm et al. (1990); Gordon-Kamm et al. (1990); Ishida (1990)), wheat (Vasil et al. (1992); Vasil et al. (1993); Weeks et al. (1993)), and rice (Christou (1991); Hiei et al. (1994); Aldemita and Hodges (1996); and Hiei et al. (1997)). For most cereal plants, embryogenic cells derived from immature scutellum tissues are the preferred cellular targets for transformation (Hiei et al. (1997); Vasil (1994)). For transforming corn embryogenic cells derived from immature scutellar tissue using microprojectile bombardment, the A188XB73 genotype is the preferred genotype (Fromm et al. (1990); Gordon-Kamm et al. (1990)). After microprojectile bombardment the tissues are selected on phosphinothricin to identify the transgenic embryogenic cells (Gordon-Kamm et al. (1990)). Transgenic plants from transformed host plant cells may be regenerated by standard corn regeneration techniques (Fromm et al. (1990); Gordon-Kamm et al. (1990)).

Example XVI: Expression and Analysis of Increased Yield or Abiotic Stress Tolerance in Non-Arabidopsis Species

Since closely-related homologs of G1988, derived from diverse plant species, that have been overexpressed in plants have the same functions of conferring increased yield, similar morphologies, reduced light sensitivity, and increased abiotic stress tolerance, including tolerance to cold during germination and low nitrogen conditions, it is expected that structurally similar orthologs of the G1988 clade of polypeptide sequences, including those found in the Sequence Listing, can confer increased yield, and/or increased tolerance to a number of abiotic stresses, including water deprivation, cold, and low nitrogen conditions, relative to control plants. As sequences of the invention have been shown to increase yield or improve stress tolerance in a variety of plant species, it is also expected that these sequences will increase yield of crop or other commercially important plant species.

Northern blot analysis, RT-PCR or microarray analysis of the regenerated, transformed plants may be used to show expression of a polypeptide of the invention and related genes that are capable of inducing abiotic stress tolerance, and/or larger size.

After a eudicot plant, monocot plant or plant cell has been transformed (and the latter plant host cell regenerated into a plant) and shown to have greater size, improved planting density (that is, the ability to tolerate greater planting density with a coincident increase in yield), improved late season vigor, or improved tolerance to abiotic stress, or to produce greater yield relative to a control plant under the stress conditions, the transformed monocot plant may be crossed with itself or a plant from the same line, a non-transformed or wild-type monocot plant, or another transformed monocot plant from a different transgenic line of plants.

The functions of specific polypeptides of the invention, including closely-related orthologs, have been analyzed and may be further characterized and incorporated into crop plants. The ectopic overexpression of these sequences may be regulated using constitutive, inducible, or tissue specific regulatory elements. Genes that have been examined and have been shown to modify plant traits (including increasing yield and/or abiotic stress tolerance) encode polypeptides found in the Sequence Listing. In addition to these sequences, it is expected that newly discovered polynucleotide and polypeptide sequences closely related to polynucleotide and polypeptide sequences found in the Sequence Listing can also confer alteration of traits in a similar manner to the sequences found in the Sequence Listing, when transformed into any of a considerable variety of plants of different species, including dicots and monocots. The polynucleotide and polypeptide sequences derived from monocots (e.g., the rice sequences) may be used to transform both monocot and dicot plants, and those derived from dicots (e.g., the Arabidopsis and soy genes) may be used to transform either group, although it is expected that some of these sequences will function best if the gene is transformed into a plant from the same group as that from which the sequence is derived.

As an example of a first step to determine water deprivation-related tolerance, seeds of these transgenic plants may be subjected to germination assays to measure sucrose sensing, severe desiccation or drought. The methods for sucrose sensing, severe desiccation or drought assays are described above. Plants overexpressing sequences of the invention may be found to be more tolerant to high sucrose by having better germination, longer radicles, and more cotyledon expansion.

Sequences of the invention, that is, members of the G1988 clade, may also be used to generate transgenic plants that are more tolerant to low nitrogen conditions or cold than control plants.

All of these abiotic stress tolerances conferred by G1988 may contribute to increased yield of commercially available plants. However, G1988 overexpressors have been shown to increase yield of plants in the apparent absence of significant or obvious abiotic stress, as evidenced by increased height, increased early season vigor and estimated stand count, and decreased early season canopy coverage observed in soy plants overexpressing G1988. It is thus expected that members of the G1988 clade will improve yield in plants relative to control plants, including in leguminous species, even in the absence of overt abiotic stresses.

Plants that are more tolerant than controls to water deprivation, low nitrogen conditions or cold are greener, are more vigorous, will have better survival rates than controls, or will recover better from these treatments than control plants.

It is expected that the same methods may be applied to identify other useful and valuable sequences of the present polypeptide clades, and the sequences may be derived from a diverse range of species.

REFERENCES CITED

-   Aldemita and Hodges (1996) Planta 199: 612-617
-   Alia et al. (1998) Plant J. 16: 155-161
-   Altschul (1990) J. Mol. Biol. 215: 403-410
-   Altschul (1993) J. Mol. Evol. 36: 290-300
-   Anderson and Young (1985) “Quantitative Filter Hybridisation”, In: Hames and Higgins, ed., Nucleic Acid Hybridisation. A Practical Approach. Oxford, IRL Press, 73-111
-   Ausubel et al. (1997) Short Protocols in Molecular Biology, John Wiley & Sons, New York, N.Y., unit 7.7
-   Bairoch et al. (1997) Nucleic Acids Res. 25: 217-221
-   Bates et al. (1973) Plant Soil 39: 205-207
-   Bates and Lynch (1996) Plant Cell Environ. 19: 529-538
-   Bechtold and Pelletier (1998) Methods Mol. Biol. 82: 259-266
-   Berger and Kimmel (1987), “Guide to Molecular Cloning Techniques”, in Methods in Enzymology, vol. 152, Academic Press, Inc., San Diego, Calif.
-   Bevan (1984) Nucleic Acids Res. 12: 8711-8721
-   Bhattacharjee et al. (2001) Proc. Natl. Acad. Sci. USA 98: 13790-13795
-   Borden (1998) Biochem. Cell Biol. 76: 351-358
-   Borevitz et al. (2000) Plant Cell 12: 2383-2393
-   Boss and Thomas (2002) Nature 416: 847-850
-   Bruce et al. (2000) Plant Cell 12: 65-79
-   Cassas et al. (1993) Proc. Natl. Acad. Sci. USA 90: 11212-11216
-   Chase et al. (1993) Ann. Missouri Bot. Gard. 80: 528-580
-   Cheikh et al. (2003) U.S. Patent Application No. 20030101479
-   Cheng et al. (1992) Proc. Natl. Acad. Sci. USA 89: 1861-1864
-   Christou et al. (1987) Proc. Natl. Acad. Sci. USA 84: 3962-3966
-   Christou (1991) Bio/Technol. 9: 957-962
-   Christou et al. (1992) Plant J. 2: 275-281
-   Coruzzi et al. (2001) Plant Physiol. 125: 61-64
-   Coupland (1995) Nature 377: 482-483
-   Crawford (1995) Plant Cell 7: 859-886
-   Daly et al. (2001) Plant Physiol. 127: 1328-1333
-   Daniel-Vedele et al. (1996) CR Acad Sci Paris 319: 961-968
-   Deshayes et al. (1985) EMBO J. 4: 2731-2737
-   D'Halluin et al. (1992) Plant Cell 4: 1495-1505
-   Donn et al. (1990) in Abstracts of VIIth International Congress on Plant Cell and Tissue Culture IAPTC, A2-38: 53
-   Doolittle, ed. (1996) Methods in Enzymology, vol. 266: “Computer Methods for Macromolecular Sequence Analysis” Academic Press, Inc., San Diego, Calif., USA
-   Draper et al. (1982) Plant Cell Physiol. 23: 451-458
-   Eddy (1996) Curr. Opin. Str. Biol. 6: 361-365
-   Eisen (1998) Genome Res. 8: 163-167
-   Feng and Doolittle (1987) J. Mol. Evol. 25: 351-360
-   Fowler and Thomashow (2002) Plant Cell 14: 1675-1690
-   Fromm et al. (1990) Bio/Technol. 8: 833-839
-   Fu et al. (2001) Plant Cell 13: 1791-1802
-   Gelvin et al. (1990) Plant Molecular Biology Manual, Kluwer Academic Publishers
-   Glantz (2001) Relative risk and risk score, in Primer of Biostatistics. 5th ed., McGraw Hill/Appleton and Lange, publisher.
-   Gilmour et al. (1998) Plant J. 16: 433-442
-   Gruber et al., in Glick and Thompson (1993) Methods in Plant Molecular Biology and Biotechnology. eds., CRC Press, Inc., Boca Raton
-   Goodrich et al. (1993) Cell 75: 519-530
-   Gordon-Kamm et al. (1990) Plant Cell 2: 603-618
-   Haake et al. (2002) Plant Physiol. 130: 639-648
-   Hain et al. (1985) Mol. Gen. Genet. 199: 161-168
-   Harrison (1999) Annu. Rev. Plant Physiol. Plant Mol. Biol. 50: 361-389
-   Haymes et al. (1985) Nucleic Acid Hybridization: A Practical Approach, IRL Press, Washington, D.C.
-   He et al. (2000) Transgenic Res. 9: 223-227
-   Hein (1990) Methods Enzymol. 183: 626-645
-   Henikoff and Henikoff (1989) Proc. Natl. Acad. Sci. USA 89: 10915
-   Henikoff and Henikoff (1991) Nucleic Acids Res. 19: 6565-6572
-   Herrera-Estrella et al. (1983) Nature 303: 209
-   Hiei et al. (1994) Plant J. 6: 271-282
-   Hiei et al. (1997) Plant Mol. Biol. 35: 205-218
-   Higgins and Sharp (1988) Gene 73: 237-244
-   Higgins et al. (1996) Methods Enzymol. 266: 383-402
-   Hosmer and Lemeshow (1999) Applied Survival Analysis: Regression Modeling of Time to Event Data. John Wiley & Sons, Inc. Publisher.
-   Ishida (1990) Nature Biotechnol. 14: 745-750
-   Jaglo et al. (2001) Plant Physiol. 127: 910-917
-   Jang et al. (1997) Plant Cell 9: 5-19
-   Kashima et al. (1985) Nature 313: 402-404
-   Kim et al. (2001) Plant J. 25: 247-259
-   Kimmel (1987) Methods Enzymol. 152: 507-511
-   Klee (1985) Bio/Technology 3: 637-642
-   Klein et al. (1987) Nature 327: 70-73
-   Koornneef et al. (1986) In Tomato Biotechnology: Alan R. Liss, Inc., 169-178
-   Ku et al. (2000) Proc. Natl. Acad. Sci. USA 97: 9121-9126
-   Kyozuka and Shimamoto (2002) Plant Cell Physiol. 43: 130-135
-   Leon-Kloosterziel et al. (1996) Plant Physiol. 110: 233-240
-   Lin et al. (1991) Nature 353: 569-571
-   Liu and Zhu (1997) Proc. Natl. Acad. Sci. USA 94: 14960-14964
-   Mandel (1992a) Nature 360: 273-277
-   Mandel et al. (1992b) Cell 71: 133-143
-   Meyers (1995) Molecular Biology and Biotechnology, Wiley VCH, New York, N.Y., p 856-853
-   Miki et al. (1993) in Methods in Plant Molecular Biology and Biotechnology, p. 67-88, Glick and Thompson, eds., CRC Press, Inc., Boca Raton
-   Mount (2001), in Bioinformatics: Sequence and Genome Analysis, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., p. 543
-   Miller et al. (2001) Plant J. 28: 169-179
-   Nandi et al. (2000) Curr. Biol. 10: 215-218
-   Peng et al. (1997) Genes Development 11: 3194-3205
-   Peng et al. (1999) Nature 400: 256-261
-   Porra et al. (1989) Biochim. Biophys. Acta 975: 384-394
-   Pourtau et al. (2004) Planta 219: 765-772
-   Putterill et al. (1995) Cell 80: 847-857
-   Ratcliffe et al. (2001) Plant Physiol. 126: 122-132
-   Reeves and Nissen (1990) J. Biol. Chem. 265: 8573-8582
-   Reeves and Nissen (1995) Prog. Cell Cycle Res. 1: 339-349
-   Riechmann et al. (2000a) Science 290: 2105-2110
-   Riechmann, J. L., and Ratcliffe, O. J. (2000b) Curr. Opin. Plant Biol. 3: 423-434
-   Rieger et al. (1976) Glossary of Genetics and Cytogenetics: Classical and Molecular, 4th ed., Springer Verlag, Berlin
-   Robson et al. (2001) Plant J. 28: 619-631
-   Sadowski et al. (1988) Nature 335: 563-564
-   Saleki et al. (1993) Plant Physiol. 101: 839-845
-   Sambrook et al. (1989) Molecular Cloning: A Laboratory Manual, 2nd Ed., Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y.
-   Sanford et al. (1987) Part. Sci. Technol. 5: 27-37
-   Sanford (1993) Methods Enzymol. 217: 483-509
-   Shpaer (1997) Methods Mol. Biol. 70: 173-187
-   Smeekens (1998) Curr. Opin. Plant Biol. 1: 230-234
-   Smith et al. (1992) Protein Engineering 5: 35-51
-   Soltis et al. (1997) Ann. Missouri Bot. Gard. 84: 1-49
-   Sonnhammer et al. (1997) Proteins 28: 405-420
-   Spencer et al. (1994) Plant Mol. Biol. 24: 51-61
-   Stitt (1999) Curr. Opin. Plant Biol. 2: 178-186
-   Suzuki et al. (2001) Plant J. 28: 409-418
-   Thompson et al. (1994) Nucleic Acids Res. 22: 4673-4680
-   Torok and Etkin (2001) Differentiation 67: 63-71
-   Tudge (2000) in The Variety of Life, Oxford University Press, New York, N.Y. pp. 547-606
-   Vasil et al. (1992) Bio/Technol. 10: 667-674
-   Vasil et al. (1993) Bio/Technol. 11: 1553-1558
-   Vasil (1994) Plant Mol. Biol. 25: 925-937
-   Vincentz et al. (1992) Plant J. 3: 315-324
-   Wahl and Berger (1987) Methods Enzymol. 152: 399-407
-   Wan and Lemeaux (1994) Plant Physiol. 104: 37-48
-   Weeks et al. (1993) Plant Physiol. 102: 1077-1084
-   Weigel and Nilsson (1995) Nature 377: 482-500
-   Weissbach and Weissbach (1989) Methods for Plant Molecular Biology, Academic Press
-   Wu et al. (1996) Plant Cell 8: 617-627
-   Xin and Browse (1998) Proc. Natl. Acad. Sci. USA 95: 7799-7804
-   Xu et al. (2001) Proc. Natl. Acad. Sci. USA 98: 15089-15094
-   Zhang et al. (1991) Bio/Technology 9: 996-997
-   Zhu et al. (1998) Plant Cell 10: 1181-1191

All publications and patent applications mentioned in this specification are herein incorporated by reference to the same extent as if each individual publication or patent application was specifically and individually indicated to be incorporated by reference.

The present invention is not limited by the specific embodiments described herein. The invention now being fully described, it will be apparent to one of ordinary skill in the art that many changes and modifications can be made thereto without departing from the spirit or scope of the Claims. Modifications that become apparent from the foregoing description and accompanying figures fall within the scope of the following Claims.

What is claimed is:
 1. A transgenic soybean plant having an altered trait relative to a control soybean plant, wherein the transgenic soybean plant comprises a recombinant polynucleotide encoding a polypeptide that comprises, in order from n-terminus to c-terminus: (a) a conserved domain with at least 58% amino acid identity with amino acids 5-50 of SEQ ID NO: 2; and (b) SEQ ID NO: 58; wherein the control soybean plant does not contain the recombinant polynucleotide; and wherein expression of the polypeptide in the transgenic soybean plant confers to the transgenic soybean plant the altered trait; and the altered trait is selected from the group of: greater water use efficiency; improved late season vigor, an increased stand count; greater late season canopy coverage; greater internode length; a reduced percentage of hard seed; a greater stem diameter, and an increased number of pod-bearing main-stem nodes.
 2. The transgenic soybean plant of claim 1, wherein the conserved domain has at least 60% identity with amino acids 5-50 of SEQ ID NO: 2.
 3. The transgenic soybean plant of claim 1, wherein the conserved domain has at least 85% identity with amino acids 5-50 of SEQ ID NO: 2.
 4. The transgenic soybean plant of claim 1, wherein the conserved domain has at least 95% identity with amino acids 5-50 of SEQ ID NO: 2.
 5. The transgenic soybean plant of claim 1, wherein the expression of the polypeptide is regulated by a constitutive promoter.
 6. The transgenic soybean plant of claim 5, wherein the constitutive promoter comprises the cauliflower mosaic virus 35S transcription initiation region or the rice actin transcription initiation region.
 7. A transgenic seed produced from the transgenic soybean plant of claim 1, wherein the transgenic seed comprises the recombinant polynucleotide.
 8. A method for altering a trait of a soybean plant as compared to a control soybean plant, the method comprising: (a) providing a recombinant polynucleotide that comprises a constitutive promoter, and the recombinant polynucleotide encodes a polypeptide that comprises, in order from n-terminus to c-terminus: (i) a conserved domain with at least 58% amino acid identity with amino acids 5-50 of SEQ ID NO: 2; and (ii) SEQ ID NO: 58; wherein the control soybean plant does not comprise the recombinant polynucleotide; and (b) introducing the recombinant polynucleotide into a target soybean plant to produce a transformed soybean plant; wherein overexpression of the polypeptide in the transformed soybean plant confers the altered trait relative to the control soybean plant; and wherein the altered trait is selected from the group of: greater water use efficiency; improved late season vigor; an increased stand count; greater late season canopy coverage; greater internode length; a reduced percentage of hard seed; a greater stem diameter; and an increased number of pod-bearing main-stem nodes.
 9. The method of claim 8, wherein the conserved domain has at least 60% identity with amino acids 5-50 of SEQ ID NO: 2.
 10. The method of claim 8, wherein the conserved domain has at least 85% identity with amino acids 5-50 of SEQ ID NO: 2.
 11. The method of claim 8, wherein the conserved domain has at least 95% identity with amino acids 5-50 of SEQ ID NO: 2.
 12. The method of claim 8, wherein the method further comprises the step of: (c) selecting a transgenic soybean plant by its ectopic expression of the polypeptide or by the presence of an altered trait of claim 8, as compared to the control soybean plant.
 13. The method of claim 8, wherein the method steps further comprise: (c) selfing or crossing the transformed soybean plant with itself or another plant, respectively, to produce a transgenic soybean seed that comprises the recombinant polynucleotide.
 14. The method of claim 8, wherein the expression of the polypeptide is regulated by a constitutive promoter.
 15. The method of claim 14, wherein the constitutive promoter comprises the cauliflower mosaic virus 35S transcription initiation region or the rice actin transcription initiation region.